dovydas.com blog

p-values

Dovydas Joksas — Thu, 29 Feb 2024 19:15 +0000

I lead math workshops for second-year electronic engineering students at UCL, and statistics has always been the most challenging topic to teach. There is far too much misuse of frequentist statistics in academia (including academic publishing), medicine, and law. Instead of making the students follow some ritual for rejecting the null hypothesis, my goal is to make sure they get the fundamentals right.

There was a study in 2002 where academics and students from psychology departments of several German universities were asked to fill out a questionnaire. It consisted of 6 statements about p-values which the participants had to mark as true or false. For the second year in a row, I’m performing a little experiment: at the beginning of the workshop, I present a similar scenario to my students and ask those same 6 questions. Here’s my version:

The p-value is the probability of obtaining a result at least as extreme as the one observed, assuming that the null hypothesis is true.

You have a treatment that you suspect may cure covid.

You compare the means of your control and experimental groups, each of size 100. At the end of the experiment, 50 people in the control group and 60 people in the experimental group do not have covid.

You use a simple independent-means t-test to investigate whether there is a significant difference between the two groups.

You compute a p-value of 0.01.

Are the following statements true or false?

1. You have absolutely disproved the null hypothesis.

2. You have found the probability of the null hypothesis being true.

3. You have absolutely proved your alternative hypothesis.

4. You can deduce the probability of the alternative hypothesis being true.

5. You know, if you decide to reject the null hypothesis, the probability that you are making the wrong decision.

6. You have a reliable experimental finding that if the experiment were repeated a great number of times, you would obtain a significant result on 99% of occasions.

Results from today’s workshop:

1. 0 true, 15 false
2. 14 true, 1 false
3. 0 true, 15 false
4. 13 true, 2 false
5. 13 true, 2 false
6. 1 true, 14 false

All six statements are, in fact, false. The lesson is that people love to attach additional meaning to p-values and other statistical concepts, even when a clear definition is given. It’s absolutely not an issue that the students didn’t get all the answers right; that’s what education is for. The problem is that even the people who are supposed to teach these things often make the same mistakes.

Podcasting 2.0

Dovydas Joksas — Thu, 28 Sep 2023 09:55 -0500

Last Friday, I had a great time on the Podcasting 2.0 show with Adam Curry and Dave Jones!

We discuss RSS Blue (my podcast hosting company), programming languages, open standards for music hosting, monetization, OnlyFans (!), censorship, and the podcast industrial complex with its delusional growth projections and broken ad models.

Listen in your browser or using your favorite podcast player!

Machine Learning With Analog Computers

Dovydas Joksas — Tue, 15 Aug 2023 09:55 +0300

Adnan Mehonic and I contributed a chapter to an upcoming edition of a textbook on nonvolatile memories. Our part discusses scalability problems in machine learning and how they may be addressed with emerging memory technologies.

I am quite happy with how the chapter turned out. We focused on the fundamentals, which, at least for me, is always the most fun to think and write about. If you’d like to read the chapter for free, there is some great news—we’ve uploaded it to arXiv! Click on the link below:

Emerging Nonvolatile Memories for Machine Learning

Vietnam

Dovydas Joksas — Tue, 18 Jul 2023 10:30 +0300

Earlier this month, I visited Vietnam for the first time in my life. It’s the most beautiful country I’ve been to; hopefully, the photos and videos below can convey at least a small part of this beauty.


Hà Nội

Hà Nội—like many cities in Vietnam—is charmingly chaotic.

Everyone’s a millionaire in Vietnam.

My first meal in Vietnam. Trying to avoid gluten has been relatively easy—noodles are everywhere, but they’re typically made from rice.

I tried my luck too. Fortunately, my epic fail wasn’t captured on camera.

Had an amazing lunch at this small, family-run restaurant.

It rained for 20 minutes. This was the only rain I experienced in Vietnam.

Hỏa Lò Prison, AKA Hanoi Hilton.

John McCain.

The imprints of the French colonial era (St. Joseph’s Cathedral).

The people gathering here are the most powerful, influential, sexiest people in Hà Nội. I even got too excited at 0:21.

Egg coffee—a Vietnamese specialty.

Hà Nội Night Market.

Tràng An

Spending two days in Ninh Bình province has probably been my favorite part of the whole trip. The nature here is so different from anything else I’ve seen before.

Linda is an amazing tour guide. She organised my travels in Northern Vietnam, including Hà Nội, Ninh Bình, and Hạ Long Bay. Whether you’re travelling alone or with a group, Linda is the best person if you want to explore around Hà Nội; you can reach her on Facebook.

Entering a 1-km-long cave.

Inside the cave. Yes, there are bats.

And leaving…

Tuyet Tinh Coc

Tam Coc

Decided to go on an evening bike ride!

Rice fields.

We’d come back to this place in a rowboat the following day.

Hai, doggy!

Thung Nang

We went on a boat trip early in the morning to see the water lilies. Thung Nang is a lot less touristy than Tràng An—two hours spent here were incredibly peaceful.

A fisherman.

Probably my favourite video captured in Vietnam.

Hang Mua

Going up…

View from the top.

Apparently, going up and down the stairs in 34°C weather is hard.

One last meal with Linda and the driver.

Hạ Long Bay

Whenever I stay at the White Lotus, I always have a memorable time.

Hội An

Hội An Market.

Stall E034, which Anthony Bourdain once ate at.

Stall E047.

Not enough food so went on a food tour!

The making of a dumpling.

Back to Hội An Market; this time, Stall E027.

I also went back to E034 to try rice dumplings with 1) green beans and 2) prawns. This probably was my favourite meal in Vietnam.

Old propaganda posters are sold everywhere. Most are prints, but there are a few shops that sell the originals as well.

Hội An Memories Show.

An Bang Beach.

Lots of dangerous animals in Vietnam.

Tailored suit

Day 1

Hội An is known for its tailors. Mr. Xe is among the most popular. I decided to order a suit, two shirts, and a pair of shorts.

Day 2

The first version of the suit was ready in 30 hours!

Day 3

Final fixes.

Day 4

Rename Von Neumann Architecture?

Dovydas Joksas — Sun, 11 Jun 2023 17:05 +0100

The following from Gödel, Escher, Bach caught my attention:

Unlike any previously designed machine, [Babbage’s Analytical Engine] was to possess both a “store” (memory) and a “mill” (calculating and decision-making unit). Babbage had a vision of numbers swirling in and out of the mill under control of a program contained in punched cards . . .

What is this if not the von Neumann architecture?

Allan Bromley’s paper on this subject makes the same connection:

The development of the multiplication algorithm played an important role in the history of the Analytical Engine because it led Babbage to the invention of the anticipating carry to speed the addition of the partial products. The complexity of the anticipating carry apparatus led him in turn to the clear separation of the functions of the mill and store, a concept that did not clearly emerge in modern computers until the work of von Neumann.

John von Neumann was a genius, but unless I’m overlooking something, it seems fairer to say that modern computers use the Babbage~~von Neumann~~ architecture. Of course, you have to take the good with the bad, so maybe the Babbage~~von Neumann~~ bottleneck is, too, a more appropriate term.

It Doesn't Matter Which Text Editor You Use

Dovydas Joksas — Wed, 15 Mar 2023 11:30 +0000

Obviously, Neovim is the best text editor. But in 2023, it mostly doesn’t matter. The power of a text editor is in its plugins, and now that we have the Language Server Protocol (LSP), you should be able to achieve the same functionality everywhere.

The old way of doing plugins

Suppose you want to write a plugin that indents Rust code in Emacs. Well, you’ll probably need to learn Elisp, the dialect of Lisp that Emacs is written in. It will take some time to learn the language, implement parsing, and figure out how to programmatically manipulate text inside Emacs. But it’s doable.

Suppose that after you’ve written the plugin, an intellectual friend of yours asks if he could use it in Vim. Well, not right away. Although the parsing part can be reused, you need to understand how to programmatically manipulate text in Vim. Even though the functionality of the plugin is the same (inserting or removing spaces at the beginnings of lines for various indentation levels), this is probably done differently in Vim and Emacs. But, again, it’s doable.

If you have n text editors, you will need to write n versions of the same plugin. And if you have m different plugins (say, for indenting code in m different languages¹), you will need to write m × n variations of plugins to cover each language and each text editor.

The LSP way

The key thing to recognize here is that the functionality of a given plugin is the same across all text editors. So maybe there is no need to repeat the same work over and over again?

LSP is an attempt to standardize all the things that a text editor might be able to do: insert text, find references, rename variables, display errors, etc. Any editor that supports LSP can be interacted with using the same commands instead of editor-specific languages or configuration files. Then, a given plugin only needs to be implemented once by using features provided by LSP; it will work in any text editor that supports LSP.

If you have n text editors, you will only need to add support for LSP in each of them once. Similarly, as mentioned before, each of the m plugins only needs to be implemented once by following the LSP specification. Although the things being implemented are different, we can essentially say that we reduced the complexity of the problem from m × n to m + n.
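To make "the same commands" a bit more concrete: LSP messages are JSON-RPC payloads preceded by a Content-Length header. Below is a minimal sketch of how an editor might frame a formatting request for a language server. The file URI, request id, and option values are made up for illustration, and a real client would first go through the initialize handshake, which is omitted here.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload and prepend the LSP Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    # The header counts bytes (not characters) and ends with a blank line.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# Hypothetical request: ask the server to format (e.g. re-indent) one file.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/formatting",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.rs"},
        "options": {"tabSize": 4, "insertSpaces": True},
    },
}

message = frame_lsp_message(request)
print(message.decode("utf-8"))
```

The editor only needs to know how to send messages like this and apply the text edits that come back; everything language-specific lives in the server.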

So what?

Unlike most standards (see below), LSP seems to have succeeded.

These days, most of the serious text editors support LSP; indeed, it’s become almost a requirement to be taken seriously. That is not to say that all text editors are now equal; they all do things differently. It’s just that there is no longer any point in arguing over the technical capabilities of each. With the proliferation of LSP implementations of plugins, it’s rare to be able to achieve something in one text editor but not in some other.

Finally, if you are crazy enough to want to write a new text editor, maybe it’s not such a bad idea—if you add support for LSP, you’ll now automatically have access to an amazing ecosystem of plugins!

Yes, I know, it’s a bit different in Python… ↩︎

Interview for Podnews Weekly Review

Dovydas Joksas — Fri, 17 Feb 2023 13:15 +0000

I was interviewed by the amazing Sam Sethi for Podnews Weekly Review!

We discuss my podcast hosting company RSS Blue, RSS technology, and the evolution of podcasting standards.

Listen in your browser or using your favorite podcast player!

MathML

Dovydas Joksas — Wed, 11 Jan 2023 11:30 +0000

MathML has been integrated into the Chromium engine, which powers browsers like Chrome, Edge, and Brave. It allows mathematical formulas to be rendered in the browser without the use of additional JavaScr*pt libraries. Browsers supporting MathML should display the formula below as a summation on the left-hand side and as a product of an “m” and a second-order derivative on the right-hand side.

$$ \sum_{i} {\vec{F}}_{i} = m \frac{d^{2} \vec{r}}{d t^{2}} $$

Now, all major browsers will support MathML. To ensure no one is left behind, I plan to wait a couple of release cycles before transitioning away from using KaTeX on this site in favor of MathML. I will have to check, but it should probably be possible to convert LaTeX to MathML with pandoc (MathML is not nice to write in).

Less Cynicism

Dovydas Joksas — Sun, 01 Jan 2023 20:05 +0200

That’s my New Year’s resolution. I think being less cynical is a good goal for anyone to have, but I’m failing to verbalize why without sounding too… cynical.

This was triggered by the responses to a Lex Fridman tweet, most of which were like “I read all of these when I was 14, lol”. People being mocked for the books they read (especially classics) has become quite common. I could somewhat comprehend this if most of us were literary geniuses, but the fact is most people don’t read. So making fun of people for doing that (or anything that helps them become better humans) seems crazy to me.

Personally, I’ll try to recognize this type of behavior in myself and try to stop it.

RIP, Angelo Badalamenti

Dovydas Joksas — Sun, 18 Dec 2022 13:10 +0200

Angelo Badalamenti, the composer best known for his work with David Lynch, passed away a few days ago. This video of him discussing how he wrote Laura Palmer’s theme for Twin Peaks is one of my favorite things on the Internet.

Replicator Dynamics

Dovydas Joksas — Tue, 06 Dec 2022 14:10 +0000

Note: This blog post contains LaTeX typography that will not be rendered correctly by RSS readers. To read the blog post with its original formatting, please visit https://dovydas.com/blog/replicator-dynamics/.

Today, I had the pleasure of attending John Carlos Baez’s very interesting talk at the Queen Mary University of London. He presented how information theory can be used to describe replicator dynamics in the context of natural selection, evolutionary algorithms, game theory, and more. You should probably check out his own writing on this topic, but I found the following particularly interesting:

Suppose we have $n$ types of replicators with population sizes $P_1(t), P_2(t), \ldots, P_n(t)$. Suppose also that these populations satisfy the following set of differential equations:

$$ \frac{\mathrm{d} P_i}{\mathrm{d} t} = f_i(P_1, P_2, \ldots, P_n) P_i $$ where $f_i$ is the fitness function of the $i$-th replicator.

Next, if we define $p_i$ to be the fraction of the population that is of type $i$, it can be shown that

$$ \left| \left| \frac{\mathrm{d} p}{\mathrm{d} t} \right| \right|^2 = \sum_i (f_i - \langle f \rangle)^2 p_i $$ The left-hand side is the square of what Baez calls the “Fisher speed”, or the rate of learning, i.e. the “speed” of the changing probability distribution $p(t) = (p_1(t), p_2(t), \ldots, p_n(t))$. The right-hand side is the variance of the fitness.

Baez interprets this as a revised version of Fisher’s fundamental theorem of natural selection. In English, he states it in the following way:

As a population changes with time, the rate at which information is updated equals the variance of fitness.
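The identity can be checked numerically. The sketch below assumes that the norm on the left-hand side is taken with respect to the Fisher information metric, i.e. the squared norm of a vector $v$ at $p$ is $\sum_i v_i^2 / p_i$ (my reading of Baez's writing, not something derived in this post). The population sizes and fitness values are made up and evaluated at a single instant.

```python
def fisher_speed_squared_and_variance(populations, fitness):
    """Return (||dp/dt||^2 under the Fisher metric, variance of fitness)."""
    total = sum(populations)
    p = [P / total for P in populations]

    # Replicator equation: dP_i/dt = f_i * P_i.
    dP = [f * P for f, P in zip(fitness, populations)]
    d_total = sum(dP)

    # Quotient rule applied to p_i = P_i / total.
    dp = [
        (dPi * total - Pi * d_total) / total**2
        for dPi, Pi in zip(dP, populations)
    ]

    # Assumed Fisher-metric squared norm: sum_i (dp_i/dt)^2 / p_i.
    speed_squared = sum(v**2 / pi for v, pi in zip(dp, p))

    mean_fitness = sum(f * pi for f, pi in zip(fitness, p))
    variance = sum((f - mean_fitness) ** 2 * pi for f, pi in zip(fitness, p))
    return speed_squared, variance


# Hypothetical populations and instantaneous fitness values.
speed_squared, variance = fisher_speed_squared_and_variance(
    populations=[10.0, 30.0, 60.0],
    fitness=[1.0, 0.5, -0.2],
)
print(speed_squared, variance)
```

The two numbers agree up to floating-point error, because the quotient rule gives $\mathrm{d} p_i / \mathrm{d} t = (f_i - \langle f \rangle) p_i$, and dividing its square by $p_i$ leaves $(f_i - \langle f \rangle)^2 p_i$.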

Time Series Data and Causality

Dovydas Joksas — Wed, 09 Nov 2022 10:15 +0000

This graph is great for demonstrating the danger of inferring causality from time series data alone.

Taken from https://www.axios.com/2022/10/15/china-party-congress-2022-xi-jinping-third-term.

Adversarial Attacks on Linear Models

Dovydas Joksas — Thu, 15 Sep 2022 15:40 +0100

I’ve been reading about adversarial attacks and wanted to implement something very simple. “The simplest adversarial attack: attacking a linear model” by Pierre Ablin helped me do just that, and this blog post adapts a lot of material from there; check it out!

Setup

Suppose we have a binary classifier that has been trained on input-output pairs $(\boldsymbol{x}_1, \boldsymbol{y}_1), \ldots, (\boldsymbol{x}_n, \boldsymbol{y}_n)$, with $\boldsymbol{x}_i \in \mathbb{R}^p$ and $\boldsymbol{y}_i \in \lbrace -1, 1 \rbrace$. Our goal is to take base input $\boldsymbol{x}_\mathrm{b}$ and transform it to poison input $\boldsymbol{x}_\mathrm{p}$ such that

its class would be guessed incorrectly by the classifier
$\boldsymbol{x}_\mathrm{p}$ would remain similar to $\boldsymbol{x}_\mathrm{b}$

If $\boldsymbol{x}$’s represent images, then “similar” could mean that any introduced changes are visually imperceptible. For example, if we are dealing with two classes of digits, say 3’s and 7’s, from the MNIST database ($p = 28 \times 28 = 784$), then our goal would be to take an image of a ‘7’ and make it look like a ‘3’ to the classifier but keep it looking like a ‘7’ to the human, and vice versa.

We will assume a classifier that makes predictions based on the sign of a linear regression model: $$ y = \text{sign}(\boldsymbol{w}^\top \boldsymbol{x} + \boldsymbol{b}) $$ with $\boldsymbol{w} \in \mathbb{R}^p$ and $\boldsymbol{b} \in \mathbb{R}$.

Thus, the decision boundary of the classifier is a hyperplane $\Pi$ with equation $$ \boldsymbol{w}^\top \boldsymbol{x} + \boldsymbol{b} = 0 $$ i.e. any point $\boldsymbol{x} \in \Pi$ must satisfy the above equation.

Strategy

To have an input $\boldsymbol{x}_\mathrm{p}$ classified differently than $\boldsymbol{x}_\mathrm{b}$, it must lie on the opposite side of the hyperplane¹. To ensure visually imperceptible changes, a sensible strategy would be to move from $\boldsymbol{x}_\mathrm{b}$ as little as possible, i.e. perpendicularly towards the hyperplane.

It might be useful to think of going from $\boldsymbol{x}_\mathrm{b}$ to $\boldsymbol{x}_\mathrm{p}$ in terms of the shortest distance from the hyperplane $\Pi$ to $\boldsymbol{x}_\mathrm{b}$. We can use vector $\boldsymbol{d}$, pointing from the closest point on $\Pi$ to $\boldsymbol{x}_\mathrm{b}$, to denote that.

We can then express $\boldsymbol{x}_\mathrm{p}$ in the following way: $$ \boldsymbol{x}_\mathrm{p} = \boldsymbol{x}_\mathrm{b} - \alpha \boldsymbol{d} $$ with $\alpha \in \mathbb{R}$.

This is useful because

$\alpha < 1$ means $\boldsymbol{x}_\mathrm{p}$ is on the same side as $\boldsymbol{x}_\mathrm{b}$, and the attack is unsuccessful
$\alpha = 1$ means $\boldsymbol{x}_\mathrm{p}$ is on hyperplane $\Pi$
$\alpha > 1$ means $\boldsymbol{x}_\mathrm{p}$ is on the opposite side to $\boldsymbol{x}_\mathrm{b}$, and the attack is successful

However, we should express $\boldsymbol{d}$ in terms of quantities we already know.

Notice that $\boldsymbol{w}$ is perpendicular to the hyperplane and so is in the same (or opposite) direction as $\boldsymbol{d}$. We can thus say that $\boldsymbol{d} = c \boldsymbol{w}$ (with $c \in \mathbb{R}$) and so $$ \boldsymbol{x}_\mathrm{p} = \boldsymbol{x}_\mathrm{b} - \alpha c \boldsymbol{w} $$

Further, we can find $c$ by considering the case of $\alpha = 1$, when $\boldsymbol{x}_\mathrm{p}$ is on the hyperplane $\Pi$ and thus must satisfy its equation: $$ \boldsymbol{w}^\top (\boldsymbol{x}_\mathrm{b} - 1 c \boldsymbol{w}) + \boldsymbol{b} = 0 $$

Solving for $c$ gives $$ c = \frac{\boldsymbol{w}^\top \boldsymbol{x}_\mathrm{b} + \boldsymbol{b}}{\boldsymbol{w}^\top \boldsymbol{w} } $$

and so $$ \boldsymbol{x}_\mathrm{p} = \boldsymbol{x}_\mathrm{b} - \alpha \frac{\boldsymbol{w}^\top \boldsymbol{x}_\mathrm{b} + \boldsymbol{b}}{\boldsymbol{w}^\top \boldsymbol{w} } \boldsymbol{w} $$
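The final formula is short enough to try out directly. Here is a minimal sketch in plain Python, using a made-up three-dimensional classifier rather than the MNIST model from this post; the weights, bias, and base input are hypothetical.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def score(x, w, b):
    """Signed quantity w^T x + b; its sign is the predicted class."""
    return dot(w, x) + b


def poison(x_b, w, b, alpha):
    """x_p = x_b - alpha * (w^T x_b + b) / (w^T w) * w."""
    c = score(x_b, w, b) / dot(w, w)
    return [xi - alpha * c * wi for xi, wi in zip(x_b, w)]


# Hypothetical classifier and base input.
w = [3.0, -1.0, 2.0]
b = 0.5
x_b = [1.0, 2.0, 0.5]

for alpha in (0.5, 1.0, 1.5):
    x_p = poison(x_b, w, b, alpha)
    print(alpha, score(x_p, w, b))
```

Substituting the formula into $\boldsymbol{w}^\top \boldsymbol{x}_\mathrm{p} + \boldsymbol{b}$ shows that the score of the poisoned input is $(1 - \alpha)$ times the score of the base input, which is why the sign flips exactly when $\alpha$ exceeds 1.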

Results

After achieving 95% test accuracy with the linear classifier, I attempted to evaluate the attack’s effectiveness. Such a simple strategy for manipulating the inputs yielded great results—I was definitely surprised. Here is how an image of a digit ‘7’ changes as we increase $\alpha$ (remember, $\alpha > 1$ is already enough for misclassification, though the higher the $\alpha$, the more confident the classifier will be of its incorrect decision):

Personally, I didn’t notice any changes until $\alpha = 5$ (more than enough for misclassification!). But if I hadn’t observed the original image, I wouldn’t even know that something is wrong. Only at around $\alpha = 50$ do we see some funny stuff (mostly in the background pixels), where one may suspect that the images are being manipulated.

The strategy of moving perpendicularly towards the decision boundary made sense, but it wasn’t at all obvious to me that the resulting poisoned image wouldn’t look drastically different. Of course, this whole “attack” assumed a lot of knowledge and power to change things on the attacker’s side, but it is still interesting that something like this could work so well. It could be that my genuine surprise comes from looking at the 2D representations of the hyperplane in the diagrams above and interpreting movement in any direction as incredibly consequential. Indeed, some existing works speculate that the effectiveness of adversarial attacks is partly due to the high dimensionality of the models (even a linear classifier for MNIST has $785$ parameters), but, so far, I haven’t found truly satisfactory explanations. I’ll have to keep looking, but please let me know if you are aware of any relevant literature!

Bonus chicanery

Trip to Peak District

Dovydas Joksas — Mon, 12 Sep 2022 11:00 +0100

Over the weekend, I had an amazing time at Peak District National Park.

Not the best time to get caught in a thunderstorm.

Lots of sheep.

Lots of ferns. Which I love because they always give me ancient forest vibes.

I’ve been told there had been even more flowers in August, but this was still breathtaking.

Kinder Scout.

Kinder Scout. I’m obviously afraid to fall off.

PhD Thesis

Dovydas Joksas — Tue, 06 Sep 2022 11:55 +0100

Happy to say that my PhD thesis—“Memristive Crossbars as Hardware Accelerators: Modelling, Design and New Uses”—has been made available online by UCL library services. You can find it (including full-text PDF) here.

As previously discussed in some of the blog posts¹, my research focused on using memristive crossbar arrays to accelerate matrix operations, which are heavily utilized in fields like machine learning. I investigated (1) the negative effects of using nonideal analog devices, (2) ways to improve the performance of these devices and systems that they comprise, and (3) new ways of using memristive crossbar arrays.

“The Power Crisis of Modern Computing”, “Analogue Solution to the Power-Hungry Field of Machine Learning”, “Memristive neural networks perform better when they work in teams”. ↩︎

Bulgakov & Lynch

Dovydas Joksas — Wed, 31 Aug 2022 10:15 +0300

At some point when reading The Master and Margarita, I realized “wow, this is so Lynchian!” Of course, due to the fundamental nature of time, it’s probably the other way around. Regardless, I’m not the first to make this connection.

Is China Doxxing People Using Targeted Advertising?

Dovydas Joksas — Sat, 27 Aug 2022 17:30 +0000

A recent leak of a whistleblower complaint by Mudge suggests that Twitter is making a lot of money from Chinese entities and that some Twitter employees have concerns about it:

Twitter executives opted to allow Twitter to become more dependent upon revenue coming from Chinese entities even though the Twitter service is blocked in China. After Chinese entities paid money to Twitter, there were concerns within Twitter that the information the Chinese entities could receive would allow them to identify and learn sensitive information about Chinese users who successfully circumvented the block, and other users around the world. Twitter executives knew that accepting Chinese money risked endangering users in China (where employing VPNs or other circumvention technologies to access the platform is prohibited) and elsewhere.

Zach Edwards suggests in a Twitter thread how that may be possible. Long story short: custom audiences for Twitter advertising can be built using email addresses and mobile IDs, which, through some clever tricks of how ad campaigns are set up, would allow one to relate those details (which Chinese entities might already have) to sensitive information, like location (which they might seek). At least, that’s how I understood it; you should check out that thread.

Abstractions in JAX

Dovydas Joksas — Thu, 11 Aug 2022 14:15 +0300

Originally posted on LinkedIn.

I’m considering making a transition from TensorFlow to JAX and, so far, am loving how effectively the latter exposes low-level behavior while still providing useful abstractions.

For example, the code snippet¹ below shows how one can perform gradient descent while utilising multiple devices:

the gradients are computed on multiple devices
they are synced across multiple devices and averaged
the new parameters are computed by adjusting them in the direction opposite to the gradient

import functools

import jax

# NOTE: loss_fn(params, xs, ys) is assumed to be defined elsewhere
# and to return a scalar loss for the given minibatch.


@functools.partial(jax.pmap, axis_name="num_devices")
def update(params, xs, ys, learning_rate=0.005):
    # 1. Compute the gradients on the given minibatch
    # (individually on each device).
    grads = jax.grad(loss_fn)(params, xs, ys)

    # 2. Combine the gradients across all devices
    # (by taking their mean).
    grads = jax.lax.pmean(grads, axis_name="num_devices")

    # 3. Each device performs its own update, but since we
    # start with the same params and synchronise gradients,
    # the params stay in sync.
    # (jax.tree_util.tree_map is used here because the jax.tree_map
    # alias is deprecated in newer JAX releases.)
    new_params = jax.tree_util.tree_map(
        lambda param, g: param - g * learning_rate,
        params,
        grads,
    )

    return new_params

adapted from a tutorial by DeepMind’s Vladimir Mikulik and Roman Ring ↩︎

Updated Blog Structure—Again!

Dovydas Joksas — Wed, 06 Jul 2022 22:58 +0300

A year and a half ago, I split my blog posts into “essays” and “updates”, the former being mostly long-form writing and the latter my personal updates and other minor stuff. I had a feeling for a long time, but now I’m almost certain: this was a bad idea!

Categorization can constrain thinking. And that’s how I’ve been feeling about my blog for some time. If I want to write a programming tutorial, it’s not really an essay, but it’s also not an update; it’s an original piece of writing. However, I don’t want to create a whole new section on my website dedicated to tutorials. Well, maybe, but it shouldn’t clutter the site. And that would certainly happen with all the new categories I’d come up with.

There is no perfect solution to this, but I did the following:

I got rid of major sections.
To make it easy to find relevant reads, I added a “category” to each post; you’ll find these at /blog. You might say that it’s the exact same thing as sections, but I really view this taxonomy as being soft, something I’ll be unafraid to change. I say “unafraid” because I removed section names from URLs, so any category changes from now on will not require tedious refactoring.

Additionally, I made these changes:

Removed dates from URLs. I hope most of my blog posts will make sense even after some time has passed.
Removed all RSS feeds except the main one. Sorry to all those who subscribed to feeds of specific tags! I just feel these feeds are redundant because I already include tag information in the elements, so you should be able to filter the posts in your RSS reader (if it’s any good…). For now, I redirect all RSS feeds to the main one. If that’s annoying, feel free to unsubscribe.
Removed trailing slashes from URLs. Looks better!

I took great care to not break anything (even wrote a few Go functions!), but if you notice anything strange, please let me know!

Future Computing Systems

Dovydas Joksas — Fri, 01 Jul 2022 16:55 +0300

There are concerns that Moore’s law, the trend of exponential increase in transistor density, no longer holds true. However, a number of new technologies may allow us to keep improving computers and even give them new capabilities.

We explore this in our perspective article titled “Memristive, Spintronic, and 2D-Materials-Based Devices to Improve and Complement Computing Hardware”, which has been published today in Advanced Intelligent Systems [1]. You can access it for free here.

Great team effort:

Reference

D. Joksas, A. AlMutairi, O. Lee, M. Cubukcu, A. Lombardo, H. Kurebayashi, A. Kenyon, and A. Mehonic, Memristive, spintronic, and 2D-materials-based devices to improve and complement computing hardware, Advanced Intelligent Systems, vol. 4, no. 8, p. 2200068, 2022. doi:10.1002/aisy.202200068

Viva

Dovydas Joksas — Thu, 30 Jun 2022 11:30 +0300

Excited to share that I successfully passed my PhD viva two days ago! The process took almost four hours! I believe this is quite long for a UK PhD defense, but we went through the thesis chapter by chapter and explored all the main research questions and ideas, so I’m very happy with how it went. I got minor corrections, which should mostly entail (1) providing additional context at the beginning of the thesis for the motivation of the whole work, and (2) thinking how to better fit the chapter on new non-machine-learning applications of crossbar arrays with the rest of the thesis (which is mostly about machine learning) or whether to include it at all.

I want to thank the examiners—Eleni Vasilaki of the University of Sheffield and Miguel Rio of University College London (UCL)—for carefully going through my work and for asking great questions during the viva. I am also incredibly grateful to my amazing supervisor Adnan Mehonic, the whole memristor group at UCL, my collaborators, the colleagues and staff at our department, and, of course, my friends and family!

It’s a great conclusion to the four years of research work, but I don’t think it changes much of anything. I’m just happy that academic spam emails I receive will now be more accurate.

Loading Remote Bibliography with Biber

Dovydas Joksas — Sun, 22 May 2022 09:45 +0000

I have tens of LaTeX projects, and I’ve always found it annoying that I have to duplicate bibliographic entries in case I use another machine where specifying an absolute path to a master BIB file wouldn’t work. I’ve always wanted to have a remote BIB file that could be automatically retrieved when compiling a LaTeX project. I didn’t realize Biber had this capability out of the box!

So now I simply have the following in my TeX preamble:

\usepackage[backend=biber]{biblatex}
% URL of plain text version of my master BIB file on GitHub (replace "github" with "raw.githubusercontent").
\addbibresource[location=remote]{https://raw.githubusercontent.com/joksas/latex-bibliography/master/phd.bib}

After which, I do

pdflatex main
biber main
pdflatex main
pdflatex main

Smarter Training of Memristive Neural Networks

Dovydas Joksas — Thu, 05 May 2022 11:45 +0000

Neural networks consume massive amounts of power, and memristive implementations may offer a solution. But although memristors are much less power-hungry, they are also stochastic and—like most analog devices—less precise. How do we deal with that?

We have been collaborating with Imperial to make memristive neural networks

adapt to nonidealities
consume even less energy
be robust to uncertainty

As illustrated below, we did this by

redefining neural network node functions so that they take into account potential nonlinearity and stochasticity
rethinking how neural network weights are implemented using memristor conductances, so that regularization could act as a way of further tuning power consumption
computing validation error multiple times at checkpoints to take the stochastic nature of memristors into account

The resulting paper just came out in Advanced Science, while the code can be found here.

I want to thank my coauthors for all their contributions:

Telegram is Not “Encrypted”

Dovydas Joksas — Sat, 26 Feb 2022 11:20 +0000

I feel like I’m repeating myself when discussing encrypted communication apps but, with Russia’s invasion of Ukraine, this does need repeating. As of 26 February, Telegram is the most downloaded social networking app in Ukraine and is incredibly popular in Eastern Europe in general. If you don’t want the Russian government to find out who you are communicating with or what the contents of your messages are, it’s dangerous to use Telegram. I’m tired of newspapers still referring to this app as “encrypted”. It’s mostly not, at least not in the way that matters.

end-to-end encryption (E2EE) is not enabled by default in individual (one-to-one) chats
E2EE is not available in group chats
E2EE is not available in channels

Sure, Telegram communications are encrypted between the users and the servers. This kind of encryption is used by all modern websites, including this one, so this is a really low bar. Because Telegram mostly doesn’t use end-to-end encryption, it holds massive amounts of unencrypted (or easily decryptable) user data and communications on those servers.

In fairness to Telegram, its founder has stood up to Putin in the past. However, this is more of a technological problem than a moral or political one. If servers hold sensitive data, governments could, in theory, access them either through court orders (of questionable legality) or through explicitly illegal means, like hacking. A better way is to simply not store such data on the servers in the first place.

Alternatives¹ to Telegram include Signal and WhatsApp. I have criticized the latter for its closed-source nature in the past, but, frankly, I believe it’s still better than Telegram in terms of security. Of course, Telegram has popular features like channels and there might not be great alternatives to that at the moment². But that doesn’t mean you should use this app for all communications, especially ones containing sensitive information.

There are also apps based on XMPP and Matrix protocols but, due to a number of social factors, I don’t think they can gain wide adoption. ↩︎
Let me know if there are! ↩︎

Who's Watching You?

Dovydas Joksas — Mon, 07 Feb 2022 12:10 +0000

Once you are told about Hikvision, you notice it everywhere in the UK. Cameras made by Chinese state-owned companies are used by most public bodies, including schools, universities, local authorities, and even a significant fraction of police forces. Some cameras have disturbing capabilities, including gender and age detection, and yet suffer from security vulnerabilities.

Big Brother Watch has just published an important report on this, detailing the findings from 4500 Freedom of Information requests and describing the capabilities of modern cameras by Hikvision and Dahua. I am happy to have been able to contribute in a small way.

You can read the full report here.

Open Source: Trust and Money

Dovydas Joksas — Sat, 13 Nov 2021 11:45 +0000

Signal recently announced that they will move away from a fully open-source model. To fight spam, a fraction of the server-side code will become closed. Some people became worried—“WHAT ARE THEY HIDING?”—but, honestly, I couldn’t care less if the server side is “open source”. It actually made me realize that we are finally living in an age where you can run fully open-source software on clients’ devices, collect zero personal data and still make money.

Trusting the Server

Here’s how the Internet¹ works. If you type yoshke.org/contact into your browser’s address bar, a request will be made to the server (that I control) responsible for the yoshke.org domain name. Ideally, all you’d want this server to do is send back an HTML document containing my contact details. But you have no guarantee of what the server will do or how. I could be

generating the page each time or simply serving a static file
sending different contact details to different people
logging your IP address and selling the information to the Chinese Communist Party

The first one shouldn’t matter to you, the second one is really annoying, and the third one—well, it depends… My point is, if you don’t know what’s happening on my server, how can you trust me with your data? After all, this website doesn’t even have a privacy policy—OH NO!

The good thing is that you don’t have to trust me. This website uses zero cookies²—you can confirm it in your browser. Thus, the only information you’re sending me is your IP address³ which alone can’t be used to personally identify you.

What about Signal? It is a messaging platform so all it deals with is private information! Fortunately, it is end-to-end encrypted. If Alice wants to send a message to Bob, it will get encrypted before leaving her device. The server will receive a series of seemingly random characters but even if they decide to store it or send it to someone other than Bob, these characters will be worthless to them. That’s because only Bob has the key capable of decrypting them into the message that Alice sent him.
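To make the Alice-and-Bob picture concrete, here is a toy sketch using only the Python standard library. It is not Signal's protocol (which is far more elaborate) and it is not secure: the prime `P`, the base `G`, and the helper names `keypair`, `shared_key`, `keystream` and `xor_cipher` are all assumptions chosen for illustration. The only point is that the relaying server handles bytes it cannot read.

```python
import hashlib
import secrets

P = 2**127 - 1  # a Mersenne prime; far too small for real-world use
G = 3           # toy base for the key exchange


def keypair():
    """Return a (private, public) pair for a toy Diffie-Hellman exchange."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def shared_key(my_private, their_public):
    """Both sides derive the same key without ever sending it over the wire."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


def keystream(key, length):
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key, data):
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


alice_private, alice_public = keypair()
bob_private, bob_public = keypair()

message = b"Meet me at noon"
# Encrypted on Alice's device; this is all the server ever sees.
ciphertext = xor_cipher(shared_key(alice_private, bob_public), message)
# Decrypted on Bob's device with a key derived from his own private value.
decrypted = xor_cipher(shared_key(bob_private, alice_public), ciphertext)

print(ciphertext.hex())
print(decrypted)
```

The server only ever relays the public values and the ciphertext. Without one of the private values, which never leave the devices, it has no practical way to recover the message.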

So all we should care about is that Signal keeps their client-side app fully open. Unless you wish to run your own Signal server, it doesn’t matter what fraction of the server-side code they publish on GitHub—we couldn’t verify that that’s what they’re running anyway. As long as we (or someone who understands cryptography) can verify that the messages are properly encrypted, we can feel safe about our communications.

Zero-Knowledge Business Model

Unfortunately, most of the other apps utilize personal data in some way. Even if they say they don’t, you can’t be sure because the majority are not revealing the code that is running on your device. And it makes sense—businesses have little incentive to open-source their apps.

But there are examples of community- or even business-driven applications whose source code is available to the end user. A major problem, though, is that most people possess multiple devices. If you install a note-taking app on your phone, you’d probably want to access those notes on your desktop computer. How do you do that while keeping them private? Open-source evangelists will probably recommend some program which you can host on your server and which will sync the data between your devices. The issue is… normal people don’t own servers.

Although the end-to-end encryption model is usually synonymous with communication apps, I believe there is a huge market in other segments as well. There is no reason why your notes, calendars or news feeds which you subscribe to shouldn’t be encrypted by default—if all the server does is synchronize data between devices, it doesn’t need to know what the contents are⁴. It also solves the financial incentive problem⁵—everything that runs on your devices is open source, yet the entrepreneurs can still make money by offering a service that syncs your data. Importantly, you don’t have to trust them because all they’ll know is where the encrypted data are coming from and where they should be sent to.

I am actually surprised this isn’t a more popular business model. In a world where people are supposedly becoming more conscious about their privacy online, only fake solutions like VPNs⁶ are starting to gain traction. I hope it’s just a marketing problem because trustless zero-knowledge apps seem to be one of the very few ways to ensure real digital privacy.

As a side note, only zoomers and the NYT choose not to capitalize this word. ↩︎
I also don’t use embedded social media buttons or YouTube video players which often enable these third parties to track you. ↩︎
This is necessary so that I know where the requested information should be sent back to. ↩︎
Etebase looks like an amazing framework for this kind of model. ↩︎
At one point in his life, Travis Oliphant, the creator of NumPy, was worried about the amount of time he was spending on writing open-source software—how does one make money, afford to have kids, etc.? So he asked Richard Stallman and here’s what Stallman had to say: “Well, you know, I think just be like me and don’t have kids.” I can’t express in words how stupid this advice is. It’s ridiculous to abandon plans of starting a family just to develop software which is in line with your ethics. ↩︎
All that VPNs are good for is torrenting and accessing US newspapers. ↩︎

Genetic Privacy

Dovydas Joksas — Sun, 03 Oct 2021 07:20 +0000

This Veritasium video gives a great overview of how genetic information can be used to solve criminal cases nowadays. The idea is that the criminal’s DNA doesn’t have to be on anyone’s database—genetic information from the relatives (e.g. from genetic testing companies) can be partially matched to the DNA found at the crime scene. This allows investigators to narrow down possible suspects to very few individuals.

I cannot imagine how important this is to the victims’ families, but the increasing availability of genetic information scares me a lot too. Not only the fact that it could be abused by governments or insurance companies (as mentioned in the video), but also the fact that catastrophic data breaches do happen. Even if you’ve never used the services of genetic testing companies, a fuzzy signature of your DNA is already stored on someone’s servers. Your DNA is not your password or your credit card—if it gets stolen, you can’t just change it. Today, the damage would probably be limited. But I’m worried how tomorrow’s technology will change the calculus.

Amazon Alternative for Books

Dovydas Joksas — Mon, 16 Aug 2021 12:50 +0000

Amazon originally started as an online marketplace for books. It has since expanded to all sectors of online retail, internet infrastructure, physical stores and services. Even so, it remains the dominant player in the bookselling business.

Not feeling comfortable using Amazon anymore, many people are looking for alternatives. But that’s especially complicated when it comes to books. Amazon is great at it—they have all the books one might be looking for, their prices are probably the best, and the review system helps you find out about the print (and sometimes content) quality before purchasing a book. But I think I found¹ a good enough alternative that I’ve been using for the past five months—bookshop.org (or if you, like me, are in the UK—uk.bookshop.org).

You can find most books on Bookshop, albeit at a slightly higher price than at Amazon. But the main reason I’m using this new service is because they support local bookshops. With every order, you’ll be contributing to an earnings pool that is evenly distributed among independent bookshops. Alternatively, you can pick a specific store you want to support.

I must mention that not all booksellers are happy about this. A New Statesman article says

Bookshops earn less through sales on Bookshop.org than they would from selling their books direct to customers, and booksellers fear the site, rather than competing with Amazon, is diverting shoppers away from the high street.

Sure, if you are looking for Obama’s newest memoir, you are going to find it in most bookshops—that’s where you should buy it. But for those looking for less popular books or for those who would buy online anyway, Bookshop seems like a good option.

thanks to The Realignment Podcast (which you should check out!) ↩︎

RSS Screwup

Dovydas Joksas — Wed, 14 Jul 2021 09:35 +0000

I was decanonifying (is that a word?) my URLs and, as a result, accidentally altered GUIDs of some RSS items. This might have led some feed readers to misidentify a number of old posts as new ones. Sorry! Should be fixed now.
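For the curious, a minimal sketch of why a changed GUID makes an old post look new, assuming a reader that remembers items purely by GUID. The `new_items` helper, the example.org URLs and the trailing-slash change are all hypothetical; actual feed readers may deduplicate differently.

```python
def new_items(feed_items, seen_guids):
    """Return the titles of items whose GUID the reader has not stored yet."""
    return [item["title"] for item in feed_items if item["guid"] not in seen_guids]


# What the reader remembers from earlier fetches (hypothetical GUID).
seen = {"https://example.org/posts/old-post/"}

feed_before = [{"title": "Old post", "guid": "https://example.org/posts/old-post/"}]
# Same post, but its GUID changed along with the URL format.
feed_after = [{"title": "Old post", "guid": "https://example.org/posts/old-post"}]

print(new_items(feed_before, seen))  # []
print(new_items(feed_after, seen))   # ['Old post']
```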

Your Medical Data

Dovydas Joksas — Fri, 11 Jun 2021 13:20 +0000

The NHS is planning to store UK patients’ medical histories in a centralized database with access provided to “academic and commercial third parties for research and planning purposes”. The government is justifying the move by claiming that it will save lives and the data¹ of patients will be replaced with unique codes anyway, so there is supposedly no risk of compromising privacy. However, the NHS will be able to convert the codes back to data to identify the patients “in certain circumstances, and where there is a valid legal reason”.

Your data are not secure. If the NHS can identify you, so can anyone who compromises their centralized database. Your data at your local GP probably aren’t secure either, but it’s less likely that they will be targeted—an orders-of-magnitude smaller reward simply isn’t worth it to most hackers. But even if this database doesn’t get hacked, the government doesn’t exactly inspire confidence so I’m not enthusiastic about entrusting it with even more power.
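The argument can be sketched with a toy model. I don't know how the NHS implements its codes, so the `Pseudonymiser` class below is purely hypothetical and the patient details are invented; it only shows that any scheme able to convert codes back must keep something (a lookup table here, a key in other designs) that re-identifies everyone if it leaks.

```python
import secrets


class Pseudonymiser:
    """Hypothetical scheme: swap identifiers for random codes, keep a lookup table."""

    def __init__(self):
        self._code_to_identity = {}

    def pseudonymise(self, identity):
        """Replace the identifying details with a random code."""
        code = secrets.token_hex(8)
        self._code_to_identity[code] = identity
        return code

    def reidentify(self, code):
        """Possible for the database owner, and for anyone who steals the table."""
        return self._code_to_identity[code]


database = Pseudonymiser()
patient = ("000 000 0000", "AB1 2CD", "1980-01-01")  # invented NHS number, postcode, birth date
code = database.pseudonymise(patient)

print(code)                       # what third parties are meant to see
print(database.reidentify(code))  # what the table holder can recover
```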

These concerns are shared by a lot of people, and so, after backlash, the NHS has decided to delay the creation of the database by two months (from July to September of this year). This is important because you now have time (if you wish) to opt out and prevent “your GP data leaving your GP practice for purposes other than your direct care”. You can do this by following the steps HERE.

NHS number, General Practice Local Patient Number, full postcode and date of birth. ↩︎

Why Doesn't Google Maps Use Its Own Subdomain?

Dovydas Joksas — Tue, 01 Jun 2021 20:50 +0000

I’ve recently moved to a new place and realized that I need a haircut. So I opened Google Maps on my laptop with the hope of finding a decent barbershop nearby. Being a bit lazy, I just pressed the button that would center the map around my location automatically. A message appeared asking me to change location permissions in the site settings. I’m not enthusiastic about providing even more data to Google, but, I figured, whatever, Maps can have my location.

When I opened the site settings, I was a bit surprised because all the permission changes would apply not only to Google Maps, but also to Google Search which uses the www.google.com domain name. I was surprised because I had been under the impression that Google Maps uses the maps.google.com domain name; had this been the case, it would have allowed me to apply separate site settings to Google Maps only¹. But then I checked the URL, and nope—it’s www.google.com/maps instead.
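Here is a small sketch of the distinction, assuming (as I believe Chromium-based browsers do) that site settings are keyed by scheme and host rather than by path. The `permission_scope` helper is a hypothetical name for illustration, not a browser API.

```python
from urllib.parse import urlsplit


def permission_scope(url):
    """Return (scheme, host): an assumed stand-in for how a browser keys site settings."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname


maps_url = "https://www.google.com/maps"
search_url = "https://www.google.com/search?q=barbershop"
old_maps_url = "https://maps.google.com/"

# The path (/maps vs /search) is ignored, so Maps and Search share one scope.
print(permission_scope(maps_url) == permission_scope(search_url))      # True
# A dedicated subdomain would have had its own, separate scope.
print(permission_scope(old_maps_url) == permission_scope(search_url))  # False
```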

Why did I think that Google Maps uses its own subdomain? Because every other major Google service does!

Gmail: mail.google.com
Google Duo: duo.google.com
Google Cloud: cloud.google.com
Google Fonts: fonts.google.com
Google Photos: photos.google.com
Google Play: play.google.com
Google Voice: voice.google.com
Google {Docs, Sheets, Slides, Forms}: docs.google.com

Now, you can enter maps.google.com into the address bar, but it will simply redirect you to www.google.com/maps. So why does Google make an exception for Maps? I don’t know for sure but I am hypothesizing that the answer lies in how I uncovered this in the first place. It could be the case that the best way for Google Search to gain access to users’ location in browsers is through Maps where providing such access seems reasonable. This information might be so valuable that it warrants Google Maps not utilizing its own subdomain.

In the end, I didn’t provide the access, and simply entered my postal code. There is a barbershop a few blocks away with a perfect 5.0 star rating (with a large enough sample size), so I guess I’ll try them!

Update #1 (June 2, 2021): In an exchange on Hacker News, I was told that Google Maps had utilized maps.google.com before, but switched to www.google.com/maps about two years ago. So this does seem to be a calculated move.

I think this is how all Chromium-based browsers handle permissions, i.e. different settings can be applied to subdomainA.domain.tld and subdomainB.domain.tld. ↩︎

Don't Teach Statistics in High School

Dovydas Joksas — Thu, 20 May 2021 16:35 +0000

In recent years, there have been more and more calls to prioritize statistics in high-school math curricula. The rationale is that unlike, say, calculus, statistics is supposedly much more applicable in everyday life. Sounds nice, but what does that really mean? Computing averages? Hell yeah, that’s useful¹! But most of the things that are taught in a typical (AP, A-level, IB) high-school stats course are not only of little practical value, but also—and more importantly—misleading. Statistics is one of very few subjects where studying it for a short time leads to poorer intuition about the real world than not having studied it at all.

It’s Real Hard

Analyzing noisy data and making decisions based on that is incredibly difficult, but we have somehow convinced ourselves that this is something any 17-year-old should be able to do with the help of a few simplistic tools. It’s delusional and it should be obvious by now. Even most of the people whose job is to utilize statistics every day don’t really know what they are doing.

In 2002, academics and students from psychology departments of several German universities were asked to fill out a questionnaire [1]. It consisted of 6 statements about the concepts behind statistical significance in hypothesis testing; the respondents had to mark each statement as either true or false. The bar chart below shows what proportion of each group marked all 6 statements correctly:

The fact that all students who had taken statistics courses made at least one mistake is not as scary as the fact that so did most of the instructors who teach these courses. But upon reflection, this isn’t surprising. Statistics courses in high schools and universities aren’t meant to develop understanding. They are all about “useful” procedures that make little sense but are supposed to make data analysis more rigorous². One such procedure has been half-jokingly dubbed the “null ritual” by Gerd Gigerenzer [2]:

Set up a statistical null hypothesis of “no mean difference” or “zero correlation.” Don’t specify the predictions of your research hypothesis or of any alternative substantive hypotheses.

Use 5% as a convention for rejecting the null. If significant, accept your research hypothesis. Report the result as p < 0.05, p < 0.01, or p < 0.001 (whichever comes next to the obtained p-value).

Always perform this procedure.

Better Than Nothing?

Misusing statistical methods is worse than not using them at all. They have become a way of presenting results based on small sample sizes as solid evidence. We know this happens all the time because a large number of studies cannot be replicated; in fact, many more than one would expect given the assumptions about randomness in them. In 2018, an effort was made to reproduce 28 famous psychology experiments [3]. Only half yielded significant results when repeated with large sample sizes.

That’s terrible. It’s not just that these studies are wrong. Other scientists build on top of that, using them to explain their own questionable findings. Pretty soon you have people tweeting “New study shows that…” or, worse, politicians introducing new legislation because TRUST THE SCIENCE™.

Things to Teach

The fact that people who take university-level statistics courses still misuse it should make it obvious that this is not something we should be teaching to high-school students. Sadly, that won’t change the minds of new-wave educators who are all about making math “more useful in the real world”. But teaching students about instantiations of concepts, rather than their abstractions, is simply a bad strategy. I almost never see students who understand the theory well struggling to apply it to a specific problem in the real world. But I observe the opposite all the time—students learn some algorithm (“trick”, “hack”), they know how to use it to solve a particular problem, but when they are asked to explain it or to apply it in a different context, they almost always fail. Statistics education in high schools and universities is the most unfortunate example of this.

I believe math education should be all about enabling students to think abstractly. But if I had to compromise and develop a syllabus for high-school statistics, it would be very different from what we have now. It wouldn’t instruct students to use hypothesis testing or correlation coefficients because these can be easily misused if not understood properly. Instead, it would be a course about being skeptical, about always questioning conclusions based on data:

Could this be explained by randomness?
Was this discovered by testing a hypothesis or looking for a pattern in a large data set [4]?
Even if observed effects are significant, could they be explained by some other variable³?
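The first two questions can be made tangible with a small simulation. This is a sketch under stated assumptions: both groups are drawn from the same normal distribution with known unit variance (so there is no real effect), the groups have 30 members each, and the function name `null_p_value` is made up for illustration.

```python
import math
import random


def null_p_value(rng, n=30):
    """Two-sided p-value for two groups drawn from the SAME distribution."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # z-statistic for the difference in means, using the known unit variance
    z = (sum(a) / n - sum(b) / n) / math.sqrt(2.0 / n)
    return math.erfc(abs(z) / math.sqrt(2.0))


rng = random.Random(42)  # fixed seed so the sketch is repeatable
p_values = [null_p_value(rng) for _ in range(1000)]
false_positives = sum(p < 0.05 for p in p_values)
print(f"{false_positives} of 1000 comparisons look 'significant' with no real effect")
```

Roughly one comparison in twenty should fall below 0.05 by chance alone. Whoever searches a large data set for patterns and reports only the hits will always have something "significant" to show.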

Knowing how to explore these is not enough to perform statistical analyses, but asking such questions will certainly make students more confident while navigating the world where dubious claims are being made all the time. And for those who wish to use statistics the right way, a long road lies ahead, with lots of concepts to be mastered first. It will take time. But that’s OK.

References

H. Haller and S. Krauss, Misinterpretations of significance: A problem students share with their teachers, Methods of Psychological Research, vol. 7, no. 1, pp. 1–20, 2002.
G. Gigerenzer, Mindless statistics, The Journal of Socio-Economics, vol. 33, no. 5, pp. 587–606, 2004. doi:10.1016/j.socec.2004.09.033
R. Klein, M. Vianello, F. Hasselman, B. Adams, R. Adams Jr, S. Alper, M. Aveyard, J. Axt, M. Babalola, Š. Bahnı́k, et al., Many labs 2: Investigating variation in replicability across samples and settings, Advances in Methods and Practices in Psychological Science, vol. 1, no. 4, pp. 443–490, 2018. doi:10.1177/2515245918810225
G. Smith and S. Ebrahim, Data dredging, bias, or confounding: They can all get you into the BMJ and the Friday papers, BMJ, vol. 325, no. 7378, pp. 1437–1438, 2002. doi:10.1136/bmj.325.7378.1437

It’s also something that students already know how to do by the time they reach high school anyway. ↩︎
I, too, used these procedures in my undergrad years, thinking that this is what science is all about. ↩︎
“Oh boy, all these drownings seem to be really driving the ice cream sales!” ↩︎

Undoing Robert Maxwell

Dovydas Joksas — Tue, 04 May 2021 09:35 +0000

“The Big Short” opens with the following lines:

In the late seventies, banking was not a job you went into to make large sums of money. It was a fucking snooze. <…> And if banking was boring then the bond department at a bank was downright comatose. <…> That is, until Lewis Ranieri came on the scene at Salomon Brothers…

What Lewis Ranieri did to banking by introducing mortgage-backed securities, Robert Maxwell had done to scientific publishing by exploiting the broken incentive structure of that industry.

The Sundae

Scientific knowledge is disseminated through academic publishers: scientists perform research, they submit their findings to journals, and then other scientists read about those findings, usually through their university library which purchases access to the periodicals. The bottleneck of this whole process, at least in the UK, used to be getting your papers even considered by the journals. These were usually created by scientific societies that were slow and focused almost exclusively on works published for their own members [1].

But in the early 1950s, there comes this guy—Robert Maxwell—whose tactics are much more aggressive. Having recently become the majority shareholder of Pergamon Press, he takes a very proactive approach to scientific publishing. He goes to conferences, invites researchers to parties in villas or on private jets¹ with one goal in mind—to convince individual scientists that their field needs a new journal and to recruit them as editors [2].

And just like that, Robert Maxwell began quickly turning Pergamon into a household name. By 1960, it was publishing 59 journals, and more were added every year [1]. It seemed that it was a win not only for the publishers, but for the scientists too—more of them were now able to share their findings with the world. And although the exponential growth of this industry was worrying to some, most believed that that’s what the market was demanding, thus the move towards a new economic equilibrium seemed only natural.

The problem was that the industry wasn’t operating under free-market principles. Maxwell’s genius was to recognize that. As new journals were emerging, scientists wanted to keep up with all this knowledge, and libraries were willing to satisfy that need. Universities began subscribing to every new journal that Pergamon or other major publishers created. With government funding for research only increasing throughout the sixties and seventies [2], there was no incentive for Maxwell (and others who mimicked him) to slow down. And so the publishers became the kingmakers of science.

Seafood Stew

Half a century later, we are at a point where each of the major academic publishers controls thousands of journals, and most of the following is usually true:

Papers submitted to journals undergo a process called peer review where other scientists judge the novelty and quality of the work, and help determine whether it should be published. They don’t get paid.
The authors sign away their copyright. It is not uncommon for the journals to demand hundreds of dollars for figures that the authors themselves produced if they want to reuse them in, say, review articles.
Governments fund scientific research in universities, scientists submit their work to journals without getting paid, and then the governments (indirectly) pay the publishers for scientists’ access to those journals.
The public (which pays for all of this) doesn’t have access to most of the published papers.

The last point has especially troubled governments for decades. But academic publishers—being the crafty and morally onerous supporters of science that they are—have finally embraced a solution proposed by activists: Open Access. For a small fee (of up to $11,200 per article) many journals now offer authors the option of making their work open to the public.

Publishers want everyone to believe that they provide great value to the scientific community and that they are already being generous, but the data just don’t support that. In 2019, Springer Nature had an operating profit margin of 23%, Wiley—of 27%, and Elsevier (which had absorbed Pergamon in the nineties)—of 37%. And that’s not surprising—most of the work for them is done for free. Their costs are mainly the salaries of the editors and the people who format the accepted papers²; hosting the papers on the website is unlikely to be expensive³. Publishers are just middlemen.

The B-Word

I was wondering how to remove the need for publishers altogether, and I’m slightly embarrassed by the first thing that popped into my mind. Blockchain. I roll my eyes 90% of the time when I hear this word because it almost always is a solution in search of a problem. However, in this specific case, it could actually solve a real problem. I was reassured after I had found out that others too are taking this idea seriously.

A blockchain approach would allow scientists to share and assess each other’s work without middlemen. More specifically, it could facilitate

peer review process that ensures
- fair selection of reviewers
- accountability of reviewers
- if necessary, anonymity of authors and/or reviewers
decentralized distribution of knowledge immune to
- arbitrary changes to terms of service
- influence from the mobs⁴

The technology to replace academic publishers is clearly here, but that’s not enough. Different journals have become (non-perfect) proxies for the quality of the research—if the author has the freedom to choose to submit their paper to either Nature or some new crypto platform used only by libertarian weirdos, they will pick the former every time. Sure, the equivalents of journals with different impact factors and prestige attached to them could naturally arise in the blockchain implementation, but not until their widespread adoption.

I love free-market solutions, but they may not be sufficient to convince the scientific community to abandon traditional academic publishers. The whole industry is just too heavily distorted by government influence. Obviously, I’m angry at the publishers but, really, I shouldn’t be. Robert Maxwell did what he was supposed to do—maximize profits. Similarly, scientists don’t have enough skin in the game, or more precisely, there aren’t enough incentives for them to consider the public interest. Both of these phenomena are emergent from the way science funding works and only governments can fix it, unfortunately. And although the move towards Open Access didn’t address the oligopoly of academic publishing, it at least showed that large-scale changes are possible.

References

B. Cox, The Pergamon phenomenon 1951–1991: Robert Maxwell and scientific publishing, Learned Publishing, vol. 15, no. 4, pp. 273–278, 2002.
S. Buranyi, Is the staggeringly profitable business of scientific publishing bad for science?, The Guardian, 2017. [Online]. Available: https://www.theguardian.com/science/2017/jun/27/profitable-business-scientific-publishing-bad-for-science [Accessed: May. 2, 2021].

I wonder if this was an inspiration to his daughter’s partner who also loved to hang out with scientists. ↩︎
By the way, the formatting is often terrible. Using a simple LaTeX template would do a better job most of the time. ↩︎
See page 9 of arXiv’s 2020 annual report. ↩︎
Just to be clear, I am not saying that this paper was very rigorous. It’s just that if we apply such standards selectively, a bias will be introduced favoring certain types of conclusions. ↩︎

Spotify Thinks I Like German Music

Dovydas Joksas — Wed, 28 Apr 2021 18:25 +0000

Spotify collects so much behavioral data, yet I’m amazed at how poorly its AI is utilizing them. Each Discover Weekly playlist of mine for the last two years has included at least one German song. I can hide these songs, I can skip them, and yet nothing has an effect on the recommendations. I’ve already lost count of how many times I’ve been recommended a 20-minute song about the Autobahn. Autobahn? Really?

The Strangest Thing About the Bitcoin White Paper

Dovydas Joksas — Wed, 07 Apr 2021 13:50 +0000

A few weeks ago, I realized that the net worth of Satoshi Nakamoto, the anonymous creator of Bitcoin, makes him¹ the fifth richest person on planet Earth. I started reading about the predictions of who he is, various analyses of his behavior before the disappearance, etc. It eventually led me to the beginning of it all—the Bitcoin white paper where, for the first time, Satoshi laid out the design principles behind the Bitcoin protocol.

I wasn’t surprised to find that the paper was written in $\LaTeX$. After all, that’s what most technically minded people use when creating complex documents. However, one thing that stood out to me was its quality. It wasn’t great. Here are a few typographical errors that I spotted right away:

Some of the variables or calculations were not typeset in math mode.
Contents inside math mode were often over-italicized, e.g. the word “if” is written in italics, even though the convention would be to make it upright.
Satoshi used " to enclose text in quotation marks, even though the preferred way is to use `` (two backticks) for opening quotation marks and '' (two single quotation marks) for closing quotation marks².

I know what you are thinking: “YOU NERD, WHO THE HELL CARES ABOUT THE TYPESETTING CONVENTIONS OF QUOTATION MARKS?!” Sure, if you prefer Microsoft W*rd, that’s not something you’ll ever consider, but people who use $\LaTeX$, on average, pay more attention to such things. The first documents I wrote in $\LaTeX$ weren’t great, and I’m learning new things up to this day. But Satoshi looked like someone a lot more experienced so these mistakes seemed out of place.

And so I DuckDuckWent³ “satoshi nakamoto bad latex” or something like that. It was at that moment in time that I realized—I am an idiot. I found this discussion on StackExchange which essentially showed that the Bitcoin white paper was not written in $\LaTeX$. I went back to the paper and became convinced of this myself. I had been fooled and I felt terrible about it.

But why was I fooled? Well, the white paper does mimic $\LaTeX$:

The margins are huge which is a feature of the default templates.
The font used is very similar to Computer Modern (the default font in $\TeX$ documents).
Centered title and numbered sections in bold.
Author name followed by an abstract in the center of the page with even larger margins than the rest of the document.

The totality of these observations made me not even question that this is a $\LaTeX$ document when I first opened it. But once the truth is revealed to you, it becomes obvious. The font is not really Computer Modern, the spacing is a little weird in some places, and the equations look really ugly even where math mode appears to be used.

The discussion on StackExchange provides some hypotheses as to how the document was actually produced. File metadata suggest that it was created using OpenOffice Writer. Of course, one could fake metadata, but the paper being written in W*rd or Writer would explain a lot—ugly typography, weird spacing, the style of the diagrams, etc. The thing that bothers me is this—why even try mimicking $\LaTeX$?

The fact that the document looks the way it does is no coincidence and there must have been a deliberate effort to make it look like that. Did Satoshi have some kind of insecurity about how his writings looked that he went to extreme lengths to make them appear more academic? That seems silly—it would have been much simpler to just use $\LaTeX$ which is very easy to learn, especially for someone like Satoshi. The second possibility is that OpenOffice had some sort of plugin which would allow one to effortlessly make any document look more $\LaTeX$-like. I don’t know what OpenOffice’s capabilities in 2008 were, but this seems unlikely. I can think of only one other possible scenario⁴; the only one that would make me happy.

What if the Bitcoin white paper was written in $\LaTeX$? At this point, you might already be thinking that this whole blog post is just me trolling bitcoiners, but hear me out. $\LaTeX$ is incredibly powerful and, in theory, you could even use it to make a document look ugly. That’s not an easy task—you would have to import ugly fonts, manually modify spacing of words and characters, and change the metadata to make it look like it was produced by OpenOffice—but it’s doable. Why would Satoshi do this though? The more I read about him, the less I understand him. But I wouldn’t be surprised if this was his attempt at humor.

Satoshi could be a man, a woman, or a group of individuals. Given the evidence, the first scenario seems the most likely one, so, for simplicity, I’ll make that assumption. ↩︎
I am referring to the way they should be typed in the source file, not how they will appear in the compiled file. ↩︎
This joke is so bad that I’m actually embarrassed. ↩︎
Also alluded to in the StackExchange discussion. ↩︎

Updated Website

Dovydas Joksas — Sun, 28 Mar 2021 19:28 +0000

I’ve been coding (startup- and PhD-related stuff) almost every day for about half a year now, so this weekend I decided to procrastinate a little bit… by coding non-urgent things. Specifically, I thought my website needed some redesigning.

When I first made it last summer, I had essentially zero experience of web development. I used the Hugo website generator, which made my life much easier. I could import a theme, change a few parameters, write a blog post in Markdown and voilà—I had a non-bloated static website ready to deploy to a server.

As I’ve learned more about how webpages work,¹ I wanted to tweak my website in various little ways. But I spent more time analyzing HTML and CSS definitions of the theme that I was using than actually improving the website. Also, I got tired of my website’s design. Too many people are using that same theme and, also, some stylistic choices didn’t make much sense to me.

Thus, on Friday, I thought I’d give a shot at redesigning the website. I still used Hugo, but instead of importing someone else’s theme, I decided to design HTML templates myself and use Bootstrap for base CSS². For design inspiration, I used the New York Review of Books website—I adopted one (red and yellow) of their many beautiful color schemes and I also redid blog post headers by mimicking some of their layouts.

Other changes made include:

I no longer show estimated reading time for any of the essays.
I no longer use Font Awesome icons. I realized I don’t need any icons at all!
I converted bitmap images in the blog posts to WebP format for smaller size. One extremely annoying thing is that a lot of websites still don’t support WebP for the Open Graph protocol (used for content previews), so I still have to use JPG or PNG for that.
I have a new 404 page!

If you find anything broken, please let me know!

Out of necessity, really, because I needed to build a dynamic web app for creating quiz questions for Ab Initio AI. ↩︎
People who say that Bootstrap is bloated are silly. You can import only the components that you need and, also, most browsers cache CSS so it’s loaded only once anyway. My customized Bootstrap CSS and JavaScript (latter for toggling menu button on small screens) take up less than 100 KB. ↩︎

Variances in Memristors: Mitigation and Exploitation (Seminar)

Dovydas Joksas — Tue, 16 Feb 2021 07:25 +0000

On the 26th of February, I will be giving a talk “Memristive Neural Networks Work Better in Teams” at a research seminar hosted by the Electronic and Electrical Engineering Department of UCL and the Agrate Unit of the Institute for Microelectronics and Microsystems (Italian National Research Council). You can register here!

Great Scott, They Predicted... the '80s!

Dovydas Joksas — Thu, 11 Feb 2021 14:35 +0000

I’ve been rewatching the Back to the Future trilogy (for the nth time) over the last few days. One thing that stood out to me this time was the Cafe 80’s from Part II. It was an attempt at depicting the 1980s as perceived by people from 2015 (remember, Part II was released in 1989).

I was surprised by how accurate that portrayal was. It was really the things many of us tend to think about when that period of time is mentioned nowadays: VHS cameras, Michael Jackson, the colors, arcade games.

I don’t think we’d be able to depict the 2010s that well. Sure, one could say that that’s a good thing, i.e. ‘culture is not so monolithic anymore’. But I fear we’re just dull—can you think of a single thing from the last decade that you’re sure would be fondly remembered 30 years from now?

Contact Tracing

Dovydas Joksas — Tue, 12 Jan 2021 15:57 +0000

There is a great article on MIT Technology Review about Singapore’s contact tracing fiasco. It is amazing to me how much trust some people put into government apps. This was supposed to happen.

Zuccing Reimagined

Dovydas Joksas — Fri, 08 Jan 2021 09:43 +0000

Half a year ago, I explored whether we can be sure if WhatsApp messages are actually end-to-end encrypted. We can’t, but I am more inclined to believe them now than I was back then. Not because I think they are the good guys, but because they are going to extreme lengths to monetize the app in other ways while maintaining that the messages are encrypted.

Yesterday, a pop-up message appeared asking me to agree to the new WhatsApp terms of service. If you actually read the new privacy policy, you find that not only will they be collecting information about you, your device and how you use the app, but they will also be able to share that information with WhatsApp’s parent company, Facebook. Those who don’t agree to the new terms won’t be allowed to use WhatsApp after February 8.

Privacy and targeted-ads-based revenue models are incompatible. Anyone who, after this, still thinks that WhatsApp prioritizes privacy is a fool. Having encrypted messages while giving all other pieces of personal information away is like putting on some sunscreen before running into a burning building.

Happy New Year!

Dovydas Joksas — Fri, 01 Jan 2021 08:56 +0000

Restoration

Dovydas Joksas — Tue, 29 Dec 2020 14:06 +0000

A few months ago I discovered an amazing YouTube channel—Baumgartner Restoration. It is maintained by a guy who restores old paintings for a living. I had no idea how many different stages this process usually involves; the level of mastery required makes these videos extremely satisfying to watch. This video is a great example of how many techniques may need to be employed to fix a painting that suffered an extensive amount of damage.

The Sopranos ∩ The Godfather ≠ ∅

Dovydas Joksas — Sat, 26 Dec 2020 08:16 +0000

One of the things I find strange about ‘The Sopranos’ is the large number of (often explicit) references to gangster movies. ‘Goodfellas’ is mentioned a few times, despite the fact that 27 actors from that movie star in ‘The Sopranos’. References to ‘The Godfather’ trilogy are even more abundant, so I figured it would not share any actors with the TV series. I was wrong—in a big way! For example, Dominic Chianese’s character Junior Soprano makes a Chinese Godfather joke, even though Chianese himself played Johnny Ola in ‘The Godfather Part II’. This somewhat takes away from the magic of ‘The Sopranos’, unless the whole series was just Tony’s dream (which I hope it wasn’t!).

GitHub Surprise

Dovydas Joksas — Sun, 20 Dec 2020 12:59 +0000

Am I the only one who didn’t know that GitHub Pro is free for students? I had no need for Pro features until now, but still…

Personal Library: How?!

Dovydas Joksas — Fri, 04 Dec 2020 08:20 +0000

As if there isn’t enough work to do, I decided to organize my personal library. Thanks to my OCD, I have zero tolerance for chaos in this world, and yet my (and my family’s) collection of books has been just that. After some rigorous research¹ on how to organize books—with some of the options being “alphabetically”, “chronologically”, “by color” (WHAT??)—I decided to go with Universal Decimal Classification which arranges materials by subject (each of whose contents I would sort alphabetically). However, I realized that if I ever change my mind on the system that should be used, I would have to start over from scratch. This made me focus on a more daunting task first—cataloging!

Before I even try to organize my books, I want to have a good understanding of what is in my possession. Essentially, I want to create a database for my books—each entry would contain the book’s title, author(s), current location (as I move between the UK and Lithuania) and any other useful details. Being the strong independent man that I am, I thought I should build it myself from the ground up. I know some SQL and PHP (both of which I had to learn for my startup), so I figured that would be doable. But there came an instant realization that I might not be the only one using the database—my family might too. Given that user interface design has never been my strong suit, I decided that this is a silly idea. Besides, I would be wasting a lot of time on this; someone must have figured it out already!

If anyone has had experience with cataloging their books using software and has any suggestions, I would really appreciate it. Here are my preferences:

Simple (functionality-wise). It’s at most a few hundred books we are talking about, not the Library of Congress.
Modifiable. I want to be able to add my own fields, etc.
Easy to use (high-level) interface, as well as a low-level alternative (e.g. a CLI).
Access over the Internet.
Privacy/independence. I want to run my personal instance of this database on my server, instead of letting some company handle it. By extension, I would want the solution to be open source.
Flexibility. The data should be stored in a popular format, so that I could easily convert them, if necessary.

There are solutions like Invenio which seem close to what I have in mind, but they have so many dependencies that I had trouble even trying to install them. Thus, before settling on a solution, I am really hoping to hear from someone who has had to deal with this kind of software before. Thank you.

googling ↩︎

Why Censuses Should Be (Slightly) Inaccurate

Dovydas Joksas — Fri, 20 Nov 2020 10:40 +0000

Since the founding of the country, the US government has been conducting a census every 10 years. The results of this survey are used to calculate the number of seats each state gets in the House of Representatives, to distribute federal funds to states, etc. [1] In addition to estimating the populations of states, the census aims to find out the number of people living in each household, as well as the age, sex, race and relationship status of each individual in that household [2]. Such data can be used by the government to, for example, monitor compliance with anti-discrimination laws, but also, if made accessible to the public, can be of great use to academics, NGOs, and others. Of course, given the sensitive nature of these data, we do not want to compromise the privacy of the individuals who are disclosing the information. So how much about the census can we reveal so that it provides useful insight into the population and, at the same time, keeps the individuals’ data confidential?

Aggregation

One of the most obvious ways to publish data about a population without revealing the identities of the individuals within it is to aggregate those data. Saying that the median age of the American population is 38.2 years tells you virtually nothing about how old Morgan Freeman or Laura Dern is. Unfortunately, that level of aggregation may not be particularly useful.

To understand local populations, more fine-grained data may be preferred, but with that comes a higher risk of privacy loss. If aggregated data about very few people are published, there may be a way to reconstruct the characteristics of those individuals. We will imagine a scenario where an intern¹ at a city council decides to upload aggregated block-by-block age data (by demographic) to the city’s public website without reviewing each block individually. We will explore data of one of these hypothetical blocks—one where 8 people live:

Some of you may notice right away how careless the release of such data is. We can already deduce that a 45-year-old white male lives in the block because there is only one white male living in the block according to the aggregated data, and we have age data for that group. As bad as it already is, there is a good chance that we might even be able to determine whether that guy is single or married, thus reconstructing all of his traits that were aggregated.

Reconstruction Attacks

In a recent paper, computer scientists at the US Census Bureau laid out how bad actors may utilize the processing power of modern computers to infer something about the individuals from their summaries—a process known as data reconstruction [1]. In one of the more popular methods involving SAT² solvers, one could describe this task almost like an algebra problem. We can attempt to reconstruct the records behind the summary tables by introducing a bunch of variables and then setting up equations (constraints) which we would try to solve. For example, we might describe the statistics above using equations such as $$\frac{A_1 + A_2 + \cdots + A_8}{8} = 39$$ where $A_i$ is the age of person #$i$.
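
To make the idea concrete, here is a minimal brute-force sketch in Python. It is not the SAT-based method from [1] and it does not use the block data above: the function name `consistent_blocks`, the three-person block and the published figures are all made up for illustration. It simply enumerates every set of integer ages that matches a published mean and median.

```python
from itertools import combinations_with_replacement
from statistics import median


def consistent_blocks(n, mean_age, median_age, min_age=0, max_age=110):
    """Return every sorted tuple of n integer ages matching the published figures.

    Assumes mean_age * n is a whole number, i.e. the mean was not rounded.
    """
    ages = range(min_age, max_age + 1)
    return [
        block
        for block in combinations_with_replacement(ages, n)
        if sum(block) == mean_age * n and median(block) == median_age
    ]


# Hypothetical release: 3 residents, mean age 30, median age 30, ages known to lie in 28..32.
candidates = consistent_blocks(3, mean_age=30, median_age=30, min_age=28, max_age=32)
print(len(candidates), "blocks fit the mean and median")

# One more published statistic (here, the age of the oldest resident) narrows the list.
narrowed = [block for block in candidates if max(block) == 32]
print(narrowed)
```

Every extra statistic acts as another constraint, so the list of candidates can only shrink. A SAT solver does the same job far more efficiently than this exhaustive search.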

I applied data reconstruction principles described in [1] to our example. Modern SAT solvers allowed me to determine with 100% certainty that the 45-year-old white male is married. Not only that, I was able to reconstruct all characteristics of all 8 people with 100% certainty. That is, from the universe of all possible combinations of the eight individuals and their traits, there was only one that fit all the data in the table, and the solver found it:

The Best Defense

What is the best way to avoid such disastrous scenarios?

Suppression

One of the techniques to avoid compromising privacy when publishing summaries about a small number of people is called cell suppression. For example, if a statistic is based on one or two people, a choice could be made to not publish it. When applied to our example, it would mean that we would have to suppress the data for single adults, black females, black males and white males. Of course, if we publish data for white females and females overall, we can easily deduce the number (2) and average age (32) of black females. Thus, an argument could be made to also suppress the data for white females. However, for the sake of simplicity and the fact that a lot of facts could be deduced this way (just using more steps), I will keep the data for white females in place. With the aforementioned fields suppressed, there are now 9 possible configurations that could produce such statistics:

The fact that now there is more than one plausible set of people that would fit the summary table, at least in theory, ensures more privacy—the attacker will be less certain which set is the real one. However, if we investigate closely, we see that the situation has not improved by a lot. All 9 possibilities are very similar—mostly just the youngest and the oldest individuals have slightly different characteristics from set to set. This means that the attacker could determine the characteristics of some people with very high certainty. We could try suppressing even more data but that may be problematic because

ensuring a high level of privacy (i.e. having many plausible combinations) severely limits how many pieces of data can be published [1],
even with relatively small data sets, it might be computationally infeasible to determine whether the data that are published could be used to identify any of the individuals [1],
there is no guarantee that the attacker will not have in their possession additional data (gathered from other sources) that, when combined with the redacted statistics, would compromise the privacy of everyone whose personal information was aggregated.
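
The deduction about black females mentioned earlier in this section is plain arithmetic, which is part of why suppressing a single cell rarely helps. Here is a small sketch, assuming the count and mean age are published for a whole group and for one of its two subgroups. The function name and the numbers are hypothetical and are not taken from the block in the example.

```python
def recover_suppressed(total_count, total_mean, known_count, known_mean):
    """Infer the count and mean age of a suppressed subgroup from two published cells."""
    hidden_count = total_count - known_count
    if hidden_count <= 0:
        raise ValueError("the known subgroup must be smaller than the whole group")
    # Sum of ages in the whole group minus sum of ages in the published subgroup.
    hidden_sum = total_count * total_mean - known_count * known_mean
    return hidden_count, hidden_sum / hidden_count


# Hypothetical cells: 5 people with mean age 40 overall, 3 of them with mean age 44.
count, mean_age = recover_suppressed(total_count=5, total_mean=40, known_count=3, known_mean=44)
print(count, mean_age)
```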

Noise

If we choose to publish more informative statistics, one thing we could do to preserve privacy is… to alter the data! If we add noise—say, a random integer between $-3$ and $3$—to your age, an attacker will be less likely to identify you because your age is not well defined anymore. But doesn’t this make the published statistics less accurate? Technically yes, but not by as much as one might expect. Remember that we are not publishing data about individuals, but rather about groups of them—if the noise is applied in a careful way, it will tend to cancel out, i.e. it is unlikely that the ages of all the residents in a particular block will be altered in the same direction. And the more aggregation we apply, the less negative impact the noise will have: city-level data will be more accurate than block-level data, state-level data will be more accurate than city-level data, and so on.
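
A quick simulation illustrates the cancelling-out effect. This is only a sketch: the ages are randomly generated rather than real data, the function name is made up, and the noise is the uniform integer between $-3$ and $3$ mentioned above.

```python
import random


def noisy_mean_error(true_ages, max_shift=3, rng=random):
    """Shift each age by an independent integer in [-max_shift, max_shift].

    Returns the absolute difference between the noisy mean and the true mean.
    """
    noisy = [age + rng.randint(-max_shift, max_shift) for age in true_ages]
    return abs(sum(noisy) / len(noisy) - sum(true_ages) / len(true_ages))


rng = random.Random(0)  # fixed seed so that runs are repeatable
for size in (8, 800, 80_000):  # roughly: a block, a neighborhood, a city
    ages = [rng.randint(0, 90) for _ in range(size)]
    print(size, round(noisy_mean_error(ages, rng=rng), 3))
```

The error of the mean can never exceed 3 years, and for independent noise it typically shrinks in proportion to one over the square root of the group size.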

Even more effective is adding noise to the tabulated data directly, instead of the individual entries. Not only do we require significantly less noise in that case [3], but it is also much more difficult to reconstruct the database, i.e. there is a very large number of plausible combinations and they are much more diverse³:

For a long time, it was difficult to quantify the effect of noise when publishing statistical results, but in 2006 a framework called differential privacy was developed [4]. This system formalizes the treatment of the trade-off between privacy loss and accuracy. That is, differential privacy shows to what extent a certain amount of noise increases privacy and decreases accuracy of published statistics, as well as how to inject the noise in the most efficient way. Importantly for censuses, this framework gives relative guarantees about the privacy loss resulting from public release of statistics [3]. Even if an attacker has additional information, such as consumer data from a credit bureau, using differential privacy in a careful way guarantees that publishing statistics will not increase the privacy loss of the people whose noisy data are included [3].

Given the power of differential privacy, it will, for the first time, be used in the 2020 US national census. Although noise was being injected in previous censuses (in the form of swapping, for example [1]), differential privacy will allow the Census Bureau to decide how much noise to inject to both 1) ensure sufficient privacy, and 2) make sure that statistics are accurate enough. Additionally, it makes it possible to prioritize—certain characteristics might be treated as more private without disproportionately affecting their accuracy [3]. Finally, differential privacy increases transparency—the Bureau can release their implementation of the method and that will not affect the privacy of the respondents [3].
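
As one concrete example of how noise can be calibrated, the paper that introduced differential privacy [4] adds Laplace-distributed noise whose scale is the sensitivity of the published statistic divided by a privacy parameter $\varepsilon$. Below is a minimal sketch for a single count, whose sensitivity is 1 because one person changes a count by at most 1. The function names and the value of $\varepsilon$ are illustrative assumptions, and this is not the Census Bureau's implementation.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=random):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # fixed seed so that runs are repeatable
# Smaller epsilon means more noise: more privacy, less accuracy.
print(private_count(8, epsilon=0.5, rng=rng))
```

The single parameter $\varepsilon$ is what makes the trade-off explicit: halving it doubles the typical size of the noise.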

The New Normal

Differential privacy is the most mathematically rigorous way of dealing with sensitive data. And it should be applied to much more than just the censuses. Whenever we are handling private information—whether sharing results of clinical trials or training neural networks on sensitive data—we have an obligation to ensure the privacy of individuals whose data are being used. At the same time, I am not optimistic that everyone will accept the use of this counter-intuitive concept of tweaking real-world data in such important procedures as national censuses. Unfortunately, there is probably no other way.

Code

The code that I used for data reconstruction examples can be found here.

References

[1] S. Garfinkel, J. Abowd, and C. Martindale, Understanding database reconstruction attacks on public data, Communications of the ACM, vol. 62, no. 3, pp. 46–53, 2019. doi:10.1145/3287287
[2] Questions Asked on the Form, United States Census 2020, [Online]. Available: https://2020census.gov/en/about-questions.html [Accessed: Nov. 10, 2020].
[3] K. Polich, Differential Privacy at the US Census, Data Skeptic, 2020. [Online]. Available: https://traffic.libsyn.com/secure/dataskeptic/differential-privacy-at-the-us-census.mp3 [Accessed: Nov. 19, 2020].
[4] C. Dwork, F. McSherry, K. Nissim, and A. Smith, Calibrating noise to sensitivity in private data analysis, In Proc. Theory of cryptography conference, 2006, pp. 265–284. doi:10.1007/11681878_14

because we can blame them for everything ↩︎
Boolean SATisfiability problem ↩︎
I made up the tables on the left just to illustrate the point. In reality, the possibilities would depend on the nature of applied noise. ↩︎

Updated Blog Structure

Dovydas Joksas — Tue, 17 Nov 2020 11:00 +0000

After some thought, I decided to restructure my blog. I realized that I occasionally want to share some small updates—a book or an article that I would like to recommend, a new paper of mine, or an announcement like this one (meta, right?). But it did not seem right to put it together with all of my other writing which up until now I considered to be what my blog would all be about. Therefore, I decided to split my blog into two sections: ‘Essays’ and ‘Updates’.

‘Essays’ will contain my long-form writing that I put some effort into, i.e. things I have been putting on this website up until now. For some reason, the word ‘essay’ always sounded very pretentious to me, but that’s what a lot of people whom I respect call their pieces of writing on their own websites, so I guess I will stick with it.

‘Updates’ will contain quick updates or generally just a less formal kind of writing. I envision these as tweets (sometimes longer than 280 characters), just on my website.

When moving to the new structure, I wanted to avoid breaking anything. Specifically, I wanted the old URLs to work (and ideally redirect to updated URLs), and to ensure that my RSS feeds do not interpret old blog posts as new ones. Items in my RSS feeds use URLs as unique identifiers, which made me realize that this change could incorrectly notify the subscribers of those feeds about old posts because their URLs have changed. Thankfully, the Hugo website generator (which I use to build my website) allowed me to address both of these issues and everything should work just fine. If you spot any strange behavior though, please let me know.

Let's Save the Internet. With RSS.

Dovydas Joksas — Wed, 23 Sep 2020 10:25 +0000

Almost everything you do on the Internet is guided by opaque algorithms. Machine-learning-powered sites deliver you your search results, decide whose posts you should see on social media and which of your emails should go to the spam folder. And although I enjoy receiving occasional spam emails (because they all prematurely refer to me as Dr Joksas and ask me to give a plenary talk at the hottest neuroscience conference of the year), it’s probably a good thing overall that these algorithms enable us to focus on the important emails. What about everything else though? It seems that in the last few years it has become more difficult to find relevant information. Exponential growth of information is partly to blame, but it might also be a result of a very basic incentive misalignment—what the tech companies want is different from what we want.

One of the most fundamental ways in which machine learning has shaped the Internet since 2012 is in the personal tailoring of content to each of us. Before that, social media sites (as an example) used to serve us content in chronological order; that is either no longer an option on most of these sites, or is a well-hidden opt-in feature that they would prefer you did not use at all. By tracking our behavior, they can now train neural networks to serve us content in such a way that maximizes some objective measure, like the time spent on their platform. Why they would do it is obvious—more time spent on their website means more ads shown to us which means higher revenues. The content that’s tailored to us is marketed as “relevant”, but a more accurate term would be “addictive”. There is maybe 2 minutes’ worth of relevant content on your Twitter feed every day, but even that little amount is hard to uncover.

So in a world of infinite streams of information and algorithms trying to manipulate us, how do we find the news that is relevant to us, how do we get updates from people whose opinions we care about, and, more generally, how do we interact with the Internet? Do we each train our personal neural networks that would learn what each of us cares about? NO! In a world of complexity, we should not seek to introduce additional complexity. What we need is simplicity and for that we will have to go back in time!

Where We’re Going, We Don’t Need Machine Learning

More than 20 years ago, a technology, called RSS (Really Simple Syndication), emerged. Its aim was to keep track of many different websites so that the users wouldn’t have to check for updates manually in each one of them. For example, if you were interested in news sites X and Y, and a blogger Z, you could subscribe to their RSS feeds and get notified whenever a new piece of content would come out. How? Using an RSS feed reader that automatically checks for updates in those feeds and then aggregates the content in one place. I know this may sound a bit confusing and given that I am obviously advocating for RSS as a replacement for social media (besides other things) when it comes to us getting updates about the world, I think it makes sense to explain how it really works. Because what is the point otherwise? If you cannot explain the basic principles behind RSS and what its limitations are, it might seem no better than using social media to interact with the Internet.

So how does RSS work?

RSS uses XML files to encode information. These are just simple text files that are usually used for structured data. In the case of RSS, they can be used to describe individual items of content in a feed. Let me illustrate it with an example. Suppose I wanted to make an RSS feed for my blog; I could create an XML file like this¹:


  
<rss>
  <channel>
    <title>dovydas.com Blog</title>
    <link>https://dovydas.com/blog/</link>
    <description>This is my blog!</description>

    <item>
      <title>Colorblind-Friendly Diagrams</title>
      <link>https://dovydas.com/blog/2020-07-06-colorblind-friendly-diagrams/</link>
      <pubDate>06 Jul 2020</pubDate>
      <description>This is my second blog post!</description>
    </item>

    <item>
      <title>Do Not Trust WhatsApp</title>
      <link>https://dovydas.com/blog/2020-07-01-do-not-trust-whatsapp/</link>
      <pubDate>01 Jul 2020</pubDate>
      <description>This is my first blog post!</description>
    </item>
  </channel>
</rss>

<p>In the XML file above, I defined an RSS feed (using <code>&lt;rss&gt;</code> and <code>&lt;channel&gt;</code> tags) for my blog and named it “dovydas.com Blog” (using <code>&lt;title&gt;</code> tags). Additionally, I added two blog posts (using <code>&lt;item&gt;</code> tags) with specific titles (using <code>&lt;title&gt;</code> tags). For both the feed and individual posts, I included URLs (using <code>&lt;link&gt;</code> tags) and short descriptions (using <code>&lt;description&gt;</code> tags). Finally, I included publication dates (using <code>&lt;pubDate&gt;</code> tags) for both blog posts.</p> <p>Now that the file is prepared, I could essentially store it anywhere on my website so that others could access it. Visitors to the site would copy the URL to that file and then paste it into their RSS reader. That RSS reader would download the file and display the various elements of the feed to the user. For example, <a href="https://tt-rss.org/" rel="noreferrer" target="_blank">my RSS reader of choice</a> would display this feed in the following way:</p> <div > <figure > </figure> </div> <p>We can see that the RSS reader presented all the information that we specified in the XML file: blog name, blog post titles, publication dates and descriptions. Additionally, I can click on the titles of individual posts and the RSS reader will redirect me to my website because the URLs were specified.</p> <p>All RSS readers update the feeds at some regular intervals, i.e. they download and process the XML file again. So, whenever I, as an owner of a website, want to add a new piece of content to the website, I should also update the XML file containing the RSS feed. That is the only way in which RSS readers will be able to detect any changes.</p> <p>And that’s it! It’s that simple. An RSS feed is just a text file that presents content in a structured, machine-readable format.</p> <h3 id="how-to-use-rss">How to use RSS?</h3> <p>Most RSS readers allow you to subscribe to as many feeds as you want. 
Here is what my own general feed is usually like:</p> <div > <figure > </figure> </div> <p>I know that content-wise this looks like social media and appearance-wise it looks like an email client. The reason I prefer RSS over social media though is because I get updated about everything—news, Twitter, YouTube, blogs, podcasts, and more—in one place, the results are not filtered by some machine learning algorithm and I can process the information in any way I want.</p> <p>Similar to the way email clients allow you to take care of your emails, most RSS readers enable you to categorize the feeds, add your own tags and organize them the way you prefer. That is the beauty of text files—they are simple, but they allow you to add a lot of modular functionality. For example, some RSS readers might provide the option to process your feeds using <a href="https://en.wikipedia.org/wiki/Regular_expression" rel="noreferrer" target="_blank">regular expressions</a>. Suppose you subscribed to a dozen technology-oriented RSS feeds but were mostly interested in robotics. You could tell your RSS reader to mark any RSS item with a tag “robotics” if it contained a string “robot” in its title or description. That way, you could easily focus on the news that you are most interested in.</p> <h2 id="the-arrow-of-time">The Arrow of Time</h2> <p>Technologies tend to become more accessible over time. Whether it is building websites, writing LaTeX documents or training image classifiers, tools are constantly being developed that allow people without in-depth knowledge to effortlessly use these technologies. The same could be said of RSS, which saw the development of hundreds of feed readers in forms of websites, desktop applications, browser extensions and even terminal applications. Unfortunately, these advances in ease of use have been canceled out by an active opposition to the adoption of RSS in a large number of sectors. 
In surveillance capitalism, there exist incentives not only to track user behavior, but also to oppose any technology that prevents you from doing it.</p> <p>The most blatant example of this behavior is, obviously, social media companies. Both Facebook and Twitter used to offer RSS feeds, but that is no longer the case. It is quite amusing that Facebook still recognizes the power of RSS, although in a very selfish way. They have this feature called instant articles that allows publishers to submit their articles to Facebook <a href="https://developers.facebook.com/docs/instant-articles/publishing/setup-rss-feed" rel="noreferrer" target="_blank">using RSS</a> or other methods. They claim it improves loading times on mobiles, but obviously their goal was to increase their own tracking capabilities even further <a href="https://www.facebook.com/business/help/2012051535674814?id=858706964600987" rel="noreferrer" target="_blank">which is possible</a> now that these articles are hosted on their platform.</p> <p>There is some hope though. In media that do not rely on such an extreme user tracking you still see some support for RSS. A large proportion of news outlets still offer their own RSS feeds where they put summaries of the articles and link back to their website. Some non-profit publications, such as <a href="https://www.quantamagazine.org/" rel="noreferrer" target="_blank">Quanta Magazine</a>, can even afford to put the whole content of their articles inside the RSS feeds—that way the users can read the articles even without leaving their RSS readers. I too put my full blog posts inside my RSS feeds; I do not place any ads on my website and so do not care whether people access the posts directly or through their feed readers<sup id="fnref:2"><a href="#fn:2" role="doc-noteref">2</a></sup>.</p> <p>But just because someone relies on ads to generate revenue, does not necessarily mean they cannot utilize RSS. Podcasts are the best example of this. 
They are, almost by definition, RSS feeds—podcast creators add a new RSS item for each episode and associate it with an audio file that anyone can download. That is the reason they do not have to upload each episode manually to Spotify, iTunes, Google Podcasts or wherever else people listen to podcasts—all these services read the RSS feeds automatically. Podcasters usually earn money by doing dumb ads, i.e. ads that are not personalized. Because of their simple format, they can simply be read out during each episode and thus be embedded in the audio file; a link to that file is part of the RSS feed. So, no. RSS does not prevent you from making money.</p> <p>Overall, I have mixed feelings about the future of RSS and the Internet in general. Sure, there are people who create tools for constructing RSS feeds of websites that do not offer them, but it’s a constant struggle—someone finds a way to automatically extract Facebook posts and then Facebook changes its interface, after which the cycle repeats. I am also not optimistic about how many people would be willing to go back to a more primitive technology, even if it gives them more control. And yet I see no other way. In a world where there is a widespread belief that advertisers must know customers’ religious beliefs and a thousand other things to successfully sell them a teapot, we will naturally converge towards more addictive social media and more intrusive tracking. Only by rejecting these business models completely can we incentivize the tech companies to change. In the meantime, RSS seems like a good enough alternative.</p> <div role="doc-endnotes"> <hr> <ol> <li id="fn:1"> <p>This is a simplified example that is missing a lot of standard tags. <a href="#fnref:1" role="doc-backlink">↩︎</a></p> </li> <li id="fn:2"> <p>The exception is posts that include math typography (which requires JavaScript). 
Although I still include these posts in my RSS feeds, I also add a note saying that they will not be rendered correctly in RSS feed readers and they should instead be read directly on my website. <a href="#fnref:2" role="doc-backlink">↩︎</a></p> </li> </ol> </div> </article> <article> <h1>Memristive neural networks perform better when they work in teams</h1> <p>Dovydas Joksas — Thu, 27 Aug 2020 11:00 +0000</p> <p><em>Originally published <a href="https://devicematerialscommunity.nature.com/posts/memristive-neural-networks-perform-better-when-they-work-in-teams" rel="noreferrer" target="_blank">here</a>—written by my supervisor, <a href="https://www.ee.ucl.ac.uk/~uceeadm/">Dr Adnan Mehonic</a>, and me. In the blog post, we discuss our recent paper <a href="https://doi.org/10.1038/s41467-020-18098-0" rel="noreferrer" target="_blank">“Committee machines—a universal method to deal with non-idealities in memristor-based neural networks”</a>.</em></p> </article> <article> <h1>“Dude, Machine Learning is Just Glorified Curve Fitting”</h1> <p>Dovydas Joksas — Tue, 04 Aug 2020 11:00 +0000</p> <p><em><strong>Note:</strong> This blog post contains LaTeX typography that will not be rendered correctly by RSS readers. To read the blog post with its original formatting, please visit <a href="https://dovydas.com/blog/dude-machine-learning-is-just-glorified-curve-fitting/" rel="noreferrer" target="_blank">https://dovydas.com/blog/dude-machine-learning-is-just-glorified-curve-fitting/</a>.</em></p> <p>There is this video of David Bowie being interviewed by Jeremy Paxman that I keep coming back to. The interview took place in 1999 and one of the topics they discuss is the Internet. Specifically, Bowie expresses his opinion that “the potential of what the Internet is going to do to society—both good and bad—is unimaginable”. What is much more interesting to me though is Paxman’s skepticism: “It’s just a tool though, isn’t it? <…> It’s simply a different delivery system”. 
You can watch the full exchange on YouTube using <a href="https://youtu.be/FiK7s_0tGsg?t=645" rel="noreferrer" target="_blank">this link</a> (the most relevant part runs from 10:45 to 11:30).</p> <p>For us, the enlightened people of 2020, it feels like Paxman made a fool of himself. But my point is that Paxman was technically correct. It is just that saying “$X$ is just $Y$” may not necessarily diminish the importance of $X$ if we are underestimating the power of $Y$ or if $Y$ is too vaguely defined.</p> <p>One of the most popular variants of this meme today is the saying “machine learning is just glorified curve fitting”. It seems to have been popularized by Judea Pearl and is now the main argument of those who question the significance of recent advances in machine learning or are skeptical about its potential to replace jobs, for example. Although you might already (correctly) guess that I am of the opinion that this way of thinking is mostly flawed, in some ways I can actually sympathize with it too.</p> <h2 id="curve-fitting">Curve Fitting</h2> <h3 id="2-d">2-D</h3> <p>Before digging into machine learning, let us clarify what curve fitting even is; it might be illustrated with a classic experiment. Suppose you have a container of gas and you measure its pressure, $P$, at various temperatures, $T$. You then plot $P$ versus $T$ and find that the relationship between the two (at least in the range that you measured) is almost perfectly linear, i.e. $P(T) = aT + b$, where $a$ and $b$ are some constants. This is an example of fitting a straight line to measurements of two variables; the fitted line can then be used to predict the value of one variable given the value of the other.</p> <h3 id="9000-d">9000-D</h3> <p>In machine learning, one might try to achieve something similar. 
For example, a neural network could be trained to differentiate between dogs and cats when provided with their images (which are collections of pixels with certain intensities of red, green and blue colors). A trained fully-connected neural network with one hidden layer would produce the relationship $\mathbf{F}(\mathbf{x}) = \sigma_2(\mathbf{A}_2 \sigma_1(\mathbf{A}_1 \mathbf{x} + \mathbf{b}_1) + \mathbf{b}_2)$ between an input image $\mathbf{x}$ and an output $\mathbf{F}(\mathbf{x})$ whose components are labeled “dog” and “cat”, where $\mathbf{A}_1$, $\mathbf{b}_1$, $\mathbf{A}_2$ and $\mathbf{b}_2$ are just collections of constants and $\sigma_1$ and $\sigma_2$ are some non-linear functions. Although scarier looking, this model performs the same task: it predicts an output given some input. You can visualize<sup id="fnref:1"><a href="#fn:1" role="doc-noteref">1</a></sup> this as fitting a multi-thousand-dimensional surface using millions of parameters.</p> <p>Large machine learning models are not elegant at all. They have millions or even billions of parameters that model the relationships between inputs and outputs. And often our intuition is that this is just an extension of the same curve fitting principle we have seen with pressure and temperature, just with more inputs and more outputs. However, as the sizes of these models grow, behaviors sometimes emerge that mimic human cognition.</p> <h2 id="alien-life-form">Alien Life Form</h2> <p>GPT-3 is the largest language model in the world; it might be the best example to illustrate my point. In the simplest terms, a language model is an auto-completer of words, sentences, paragraphs, etc. Even the most basic of them should be able to complete sentences like “My favorite color is __” with words like “red” or “light blue” instead of “hamburger” or “Richard Nixon”. GPT-3 is much more sophisticated. For example, when provided with a definition of a made-up word, this language model can use it in context [<a href="#bibreference-1" title="T. 
Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, 2020. [Online]. Available: https://arxiv.org/abs/2005.14165 ">1</a>]:</p> <blockquote> <p><strong>Human input:</strong><br> A “Burringo” is a car with very fast acceleration. An example of a sentence that uses the word Burringo is:<br> <strong>GPT-3 output:</strong><br> In our garage we have a Burringo that my father drives to work every day.</p> </blockquote> <p>There are many examples of what GPT-3 is capable of doing but to me the most fascinating one is its ability to perform basic arithmetic. When asked “What is 3 plus 2?”, the model will almost certainly correctly answer “5”. Of course, what could have happened is that the model might have found such a sentence in its training data set and simply memorized it. That is almost certain with one-digit addition given how much text GPT-3 was trained on. However, it performs almost as well on two- or three-digit addition and subtraction. Out of 2,000 three-digit subtraction problems, only 0.1% of which appeared in the training data set, GPT-3 was able to compute the answer correctly 94% of the time [<a href="#bibreference-1" title="T. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, 2020. [Online]. Available: https://arxiv.org/abs/2005.14165 ">1</a>]. And the mistakes that it made are human-like—“forgetting” to borrow a “1”, for example. But it does carry or borrow a “1” when it needs to most of the time which in itself is impressive because it was not explicitly told the rules of addition and subtraction. 
It is amazing that with enough data even a relatively simple<sup id="fnref:2"><a href="#fn:2" role="doc-noteref">2</a></sup> model can “learn” mathematical rules by analyzing text and extracting a pattern.</p> <h2 id="conclusion">Conclusion</h2> <p>Most machine learning models are indeed just a high-dimensional equivalent of curve fitting. But thinking about them at this level of abstraction is simply not helpful. I agree with the sentiment that just building larger models is not a feasible strategy to achieve artificial general intelligence. However, we keep getting surprised by how good these dumb curve-fitting algorithms can be at performing cognitive tasks. My enthusiasm and concern regarding machine learning stem from the belief that with large enough models emergent behaviors arise that emulate the way we humans perform many tasks. I fear that no matter how anticlimactic it sounds, functions with a few billion parameters might replace millions of jobs. Not because these mathematical functions can develop some sort of advanced intelligence, but because a lot of what we do does not require it.</p> <h2 id="references">References</h2> <ol ><li id="bibreference-1">T. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, <em>et al.</em>, <q>Language models are few-shot learners,</q> 2020. [Online]. Available: <a rel="noopener" target="_blank" href="https://arxiv.org/abs/2005.14165">https://arxiv.org/abs/2005.14165</a> </li></ol> <div role="doc-endnotes"> <hr> <ol> <li id="fn:1"> <p>just kidding <a href="#fnref:1" role="doc-backlink">↩︎</a></p> </li> <li id="fn:2"> <p>structure-wise, not size-wise <a href="#fnref:2" role="doc-backlink">↩︎</a></p> </li> </ol> </div> </article> <article> <h1>The Destroyer of Worlds</h1> <p>Dovydas Joksas — Thu, 16 Jul 2020 11:00 +0000</p> <p>75 years ago today, the first nuclear weapon was detonated. 
As Robert Oppenheimer recalled of that day, <em>“We knew the world would not be the same.”</em> Although we are no longer (technically) in the Cold War, the threat of a nuclear apocalypse is as real as it has ever been since 1945. And yet I feel stupid writing about it—contrary to the evidence, this scenario feels so improbable, something you would only see in a sci-fi movie or a David Lynch creation<sup id="fnref:1"><a href="#fn:1" role="doc-noteref">1</a></sup>.</p> <p>And it is true that, at least for now, no single bomb is capable of destroying the whole of humanity. But having a weapon that is so powerful naturally produces a system of strategies and incentives in which the detonation (or even a perceived threat) of a single bomb can easily lead to the launch of a hundred rockets equipped with nuclear warheads. What is so challenging to comprehend is that <em>this is most likely to happen by accident</em>. It is difficult to estimate the probabilities of events that have never happened before, but looking at the history of humanity since the Trinity test, it seems that we have been really, really, really lucky not to have experienced a nuclear war yet.</p> <h2 id="close-calls">Close Calls</h2> <h3 id="october-1960">October 1960</h3> <p>To detect any incoming missiles from the Soviet Union, the United States built a large number of radars in various countries. On October 5, 1960, one such radar in Greenland sent an automated warning to the North American Aerospace Defense Command (NORAD) in the US reporting that a huge number of missiles had been detected, suggesting with 99.9% certainty that the USSR was attacking the US [<a href="#bibreference-1" title="E. Schlosser, Command and control: Nuclear weapons, the Damascus accident, and the illusion of safety. Penguin, 2013. ">1</a>]. One can only imagine the amount of panic there must have been at NORAD.</p> <p>One thing made no sense at all though—at that time, USSR leader Nikita Khrushchev was in New York! 
Given the low likelihood of such a Shakespearean scenario, there was no immediate military response. It was later found that the radar in Greenland had actually detected the rising of the Moon, and not a swarm of missiles.</p> <h3 id="january-1961">January 1961</h3> <p>On January 24, 1961, an American B-52 bomber broke up while flying over Goldsboro, a city in North Carolina. It was carrying two Mark 39 thermonuclear weapons, each of which was around 270 times more powerful than the bomb dropped on Hiroshima. The plane crashed and the bombs fell in a field; fortunately, neither of them actually exploded. However, a <a href="https://www.theguardian.com/world/interactive/2013/sep/20/goldsboro-revisited-declassified-document" rel="noreferrer" target="_blank">document</a> acquired under the Freedom of Information Act in 2013 reveals a horrifying reality. Mark 39 bombs were equipped with six interlocking mechanisms, all of which had to be triggered for the bomb to explode. Inspection of one of the bombs found that “five of the six interlocks had been set off by the fall”. Had the bomb exploded, it could have resulted not only in hundreds of thousands of lives lost, but also in a perception that the US was being attacked by a foreign country.</p> <h3 id="september-1983">September 1983</h3> <p>Having witnessed the Hungarian Uprising with his own eyes<sup id="fnref:2"><a href="#fn:2" role="doc-noteref">2</a></sup>, Yuri Andropov, the head of the KGB and later the leader of the Soviet Union, over the years became increasingly paranoid about possible threats to the Soviet Union. George H. W. 
Bush’s White House <a href="https://apps.washingtonpost.com/g/documents/world/read-the-us-assessment-that-concluded-the-soviet-leadership-feared-an-american-nuclear-strike-in-1983/1779/" rel="noreferrer" target="_blank">top-secret intelligence review</a><sup id="fnref:3"><a href="#fn:3" role="doc-noteref">3</a></sup> declassified in 2015 states that “[The President’s Foreign Intelligence Advisory Board (PFIAB)] believe[s] that the Soviets perceived <…> that the chances of the US launching a nuclear first strike [during the first years of Reagan’s presidency] <…> were growing. [The PFIAB] also believes that the US intelligence community did not at the time <…> attach sufficient weight to the possibility that the war scare was real.” For example, in 1981 Andropov, as the head of the KGB, initiated Operation RYaN (РЯН). Its aim was to collect intelligence on a nuclear attack that the Reagan administration might have been planning (as speculated by the Soviets).</p> <p>Even if unjustified, the fear was real. Tension peaked in 1983: on September 1, the Soviets shot down a South Korean airliner carrying 269 people, including a US congressman. What happened 25 days later is probably the closest we have come to a nuclear war.</p> <p>At that time, the USSR had a protocol under which, if a warning system reported that incoming missiles had been detected, an <em>immediate</em> nuclear counter-attack would be launched against the US. On September 26, 1983, an early-warning system detected a missile strike and sent a notification to Stanislav Petrov, a duty officer who was working near Moscow. His job was simple—to report any such warnings to his superiors. But he did not. Suspecting a false alarm (<a href="https://www.bbc.co.uk/news/world-europe-24280831" rel="noreferrer" target="_blank">though not being completely sure about it</a>), he made a decision that he was not supposed to make. 
The highest-ranking officials in the Soviet Union, including its leader Yuri Andropov, were ready to launch a retaliatory strike. But by making the decision to dismiss the warning himself, Stanislav Petrov may have saved the world. And it did not take long to find out that he had made the right call—there were no missiles after all.</p> <h2 id="conclusion">Conclusion</h2> <p>These are just a few of many examples of nuclear close calls. The Cold War ended three decades ago but the nuclear weapons are still here—fourteen thousand of them, actually. Whether it is the US and North Korea, India and Pakistan, or <a href="https://www.youtube.com/watch?v=SUuOskX3z7U" rel="noreferrer" target="_blank">a group of radicals with access to a uranium enrichment facility</a>, a nuclear war can erupt very easily. We want to think that the decision to start such a war would be carefully thought through; history shows us that this is unlikely. As a matter of fact, some nuclear strike authorization protocols make it so that no <em>conscious</em> decision would need to be made at all. A faulty switch or a software bug could initiate a series of events that, <a href="https://www.netflix.com/title/80190519" rel="noreferrer" target="_blank">like some mad conductor</a>, would move us toward a war that <em>no one</em> wants.</p> <h2 id="references">References</h2> <ol ><li id="bibreference-1">E. Schlosser, <em>Command and control: Nuclear weapons, the Damascus accident, and the illusion of safety</em>. Penguin, 2013. </li></ol> <div role="doc-endnotes"> <hr> <ol> <li id="fn:1"> <p>Twin Peaks season 3, episode 8: <em>Gotta Light?</em> <a href="#fnref:1" role="doc-backlink">↩︎</a></p> </li> <li id="fn:2"> <p>While serving as the Soviet ambassador to Hungary. <a href="#fnref:2" role="doc-backlink">↩︎</a></p> </li> <li id="fn:3"> <p>If you are in Europe, you might need a VPN to access this. 
Or you can try <a href="https://assets.documentcloud.org/documents/2484214/read-the-u-s-assessment-that-concluded-the.pdf" rel="noreferrer" target="_blank">this link</a> instead. <a href="#fnref:3" role="doc-backlink">↩︎</a></p> </li> </ol> </div> </article> <article> <h1>Colorblind-Friendly Diagrams</h1> <p>Dovydas Joksas — Mon, 06 Jul 2020 11:00 +0000</p> <p><em><strong>Note:</strong> This blog post contains LaTeX typography that will not be rendered correctly by RSS readers. To read the blog post with its original formatting, please visit <a href="https://dovydas.com/blog/colorblind-friendly-diagrams/" rel="noreferrer" target="_blank">https://dovydas.com/blog/colorblind-friendly-diagrams/</a>.</em></p> <p><a href="https://ghr.nlm.nih.gov/condition/color-vision-deficiency#statistics" target="_blank" rel="noopener" >According to the National Institutes of Health</a>, around 1 in 12 males and 1 in 200 females have some form of color vision deficiency. I will admit that until recently I have not thought about the implications of this on my work. But many of us, especially in research, use colors in diagrams to communicate ideas. And as much as we like to associate abstract concepts with colors (e.g. “good is green, bad is red”), it might backfire on us. I will borrow an example from a 2002 paper by Okabe and Ito [<a href="#bibreference-1" title="M. Okabe and K. Ito, How to make figures and presentations that are friendly to color blind people, University of Tokyo, 2002. ">1</a>]. Imagine that you submit a manuscript to a journal and it is then sent to three male reviewers (which even today is not unrealistic in some fields). The probability that at least one of them is colorblind is ~23%<sup id="fnref:1"><a href="#fn:1" role="doc-noteref">1</a></sup>.</p> <p>Does this mean we should avoid using colors in our diagrams and instead utilize patterns and symbols? Not necessarily. 
If we are smart about it, we can design colorful diagrams that are accessible to colorblind people.</p> <h2 id="the-nature-of-colorblindness">The Nature of Colorblindness</h2> <p>In most cases, <em>colorblind people can still see colors</em>, though usually in a limited range. Human eyes have three special types of cells that are sensitive to mainly red, green and blue light [<a href="#bibreference-1" title="M. Okabe and K. Ito, How to make figures and presentations that are friendly to color blind people, University of Tokyo, 2002. ">1</a>]. Each cell type is associated with a gene; if one of those genes mutates, it changes how we perceive colors. Most colorblind people have problems with either the “red” or the “green” gene. A much rarer occurrence is people with a mutated “blue” gene. People with mutated “red”, “green” and “blue” genes are said to have <strong>protanopia</strong>, <strong>deuteranopia</strong> and <strong>tritanopia</strong>, respectively [<a href="#bibreference-1" title="M. Okabe and K. Ito, How to make figures and presentations that are friendly to color blind people, University of Tokyo, 2002. ">1</a>].</p> <h2 id="types-of-palettes">Types of Palettes</h2> <p>When designing diagrams, there are three main types of color palettes that you might use: sequential, diverging and categorical. Sequential and diverging palettes are usually used for continuous variables (like temperature) and can be constructed using just one or two main colors, in addition to a neutral color (e.g. black, gray or white), by producing a gradient. Categorical palettes, on the other hand, are used either for data that are discrete in nature or to differentiate between different categories. In most cases, palettes of this type have more colors. 
Because of this, I will focus on categorical palettes—they represent a more general case and so the findings will also be applicable to sequential and diverging palettes where fewer unique colors are needed.</p> <h2 id="the-problem-with-plotting-software">The Problem with Plotting Software</h2> <p><em>Whatever plotting software you use, chances are it is not optimized for colorblind people by default</em>.</p> <p>I do most of my programming in Python and so naturally I use the <a href="https://matplotlib.org/" target="_blank" rel="noopener" >matplotlib library</a> for plotting any experimental or simulation results. Unfortunately, the default color cycle used by matplotlib for categorical data is not colorblind-friendly and <a href="https://github.com/matplotlib/matplotlib/issues/9460" target="_blank" rel="noopener" >this issue is known</a>. In the image below, using <a href="https://www.color-blindness.com/coblis-color-blindness-simulator/" target="_blank" rel="noopener" >this online tool</a>, I simulated how matplotlib’s color cycle would appear to people with the three types of colorblindness. We can see that people with protanopia and deuteranopia, in particular, might find it challenging to distinguish between some of the colors. One of the main issues here might be the fact that matplotlib uses a lot of colors. The more colors one wants to add to a palette, the more difficult it becomes to find a set of colors that are easy to distinguish, especially when you want to make the palette colorblind-friendly.</p> <p>Back in 2014, <a href="https://www.mathworks.com/products/matlab/matlab-graphics.html" target="_blank" rel="noopener" >MATLAB introduced a new default color cycle</a>. Because they had gotten rid of the bright red, I assumed this new palette was colorblind-friendly. 
However, while researching this blog post I could not find any official information on their website, and after I simulated how their default color cycle is perceived by colorblind people (in the image below), I am no longer so sure. Although it is better than matplotlib’s default colors, it seems to me that people with protanopia and deuteranopia might find it difficult to distinguish between colors 2 and 5—they would be perceived as having almost identical hue and similar brightness (or at least that is what my eyes are telling me…).</p> <div > <figure > </figure> </div> <h2 id="a-better-palette">A Better Palette</h2> <p>The color palette that I have now adopted is the one by Okabe and Ito. It seems to have been popularized by Bang Wong’s Nature Methods column in 2011 [<a href="#bibreference-2" title="B. Wong, Points of view: Color blindness, Nature Methods, vol. 8, no. 6, p. 441, 2011. doi:10.1038/nmeth.1618">2</a>]. In the image below, you can see the original palette and simulations of how it is perceived by people with colorblindness. It is not perfect—for example, colors 2 and 7 might look quite similar to people with protanopia, but at least there is a noticeable difference in brightness, which is larger than in the MATLAB example. Also, people with tritanopia might find it difficult to differentiate between some of the colors. However, tritanopia is orders of magnitude rarer than, for example, protanopia, so that might be a reasonable compromise.</p> <div > <figure > </figure> </div> <p>RGB, CMYK and hex color codes for the Okabe-Ito palette are listed below. The palette also includes black, which could be considered cheating as it would be distinguishable in any color palette… Personally, I use it only if necessary and have thus set it as my eighth (instead of first) color because I want to distinguish diagram elements from the text, which is usually typed in black. 
However, in most cases, you do not even need that many different colors, so that is not an issue.</p> <table> <thead> <tr> <th>Color</th> <th>RGB (0-255)</th> <th>CMYK (0-100)</th> <th>Hex</th> </tr> </thead> <tbody> <tr> <td><span style="text-decoration: underline; text-decoration-color: #000000">Black</span></td> <td>0, 0, 0</td> <td>0, 0, 0, 100</td> <td>000000</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #E69F00">Orange</span></td> <td>230, 159, 0</td> <td>0, 50, 100, 0</td> <td>E69F00</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #56B4E9">Sky blue</span></td> <td>86, 180, 233</td> <td>80, 0, 0, 0</td> <td>56B4E9</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #009E73">Bluish green</span></td> <td>0, 158, 115</td> <td>97, 0, 75, 0</td> <td>009E73</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #F0E442">Yellow</span></td> <td>240, 228, 66</td> <td>10, 5, 90, 0</td> <td>F0E442</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #0072B2">Blue</span></td> <td>0, 114, 178</td> <td>100, 50, 0, 0</td> <td>0072B2</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #D55E00">Vermilion</span></td> <td>213, 94, 0</td> <td>0, 80, 100, 0</td> <td>D55E00</td> </tr> <tr> <td><span style="text-decoration: underline; text-decoration-color: #CC79A7">Reddish purple</span></td> <td>204, 121, 167</td> <td>10, 70, 0, 0</td> <td>CC79A7</td> </tr> </tbody> </table> <h2 id="final-remarks">Final Remarks</h2> <p>Although I provide some recommendations, this blog post is not meant to be a complete guide to preparing colorblind-friendly diagrams. Whatever you use for plotting—whether it is Excel, ggplot2, matplotlib, MATLAB or something else—you should familiarize yourself with the capabilities and limitations of these tools. 
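</p> <p>As a rough sketch of how the palette from the table above could be put to use in Python, the snippet below stores the hex values (with black placed last, as described earlier) and converts them back to RGB as a sanity check against the table. The helper name <code>use_okabe_ito</code> is hypothetical and not part of matplotlib; it assumes matplotlib and its <code>cycler</code> dependency are installed and simply sets the palette as the default color cycle.</p>

```python
# Okabe-Ito hex values copied from the table above. Black is placed last,
# matching the preference of using it as the eighth color.
OKABE_ITO = {
    "orange": "#E69F00",
    "sky blue": "#56B4E9",
    "bluish green": "#009E73",
    "yellow": "#F0E442",
    "blue": "#0072B2",
    "vermilion": "#D55E00",
    "reddish purple": "#CC79A7",
    "black": "#000000",
}


def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to an (R, G, B) tuple of 0-255 integers."""
    value = hex_color.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def use_okabe_ito():
    """Set the palette as matplotlib's default color cycle (hypothetical helper)."""
    # Imported lazily so the rest of the sketch runs without matplotlib installed.
    import matplotlib.pyplot as plt
    from cycler import cycler

    plt.rcParams["axes.prop_cycle"] = cycler(color=list(OKABE_ITO.values()))


if __name__ == "__main__":
    # Print each color with its RGB triple for comparison with the table.
    for name, hex_color in OKABE_ITO.items():
        print(f"{name:15s} {hex_color} {hex_to_rgb(hex_color)}")
```

<p>Calling <code>use_okabe_ito()</code> once, before any figures are created, should make subsequent plots cycle through these colors in the order listed.</p> <p>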
Some plotting software does not use colorblind-friendly palettes by default but still offers them to users. For example, matplotlib does not implement a colorblind-friendly palette in its default color cycle, but provides the option to use <a href="https://matplotlib.org/3.1.0/tutorials/colors/colormaps" target="_blank" rel="noopener" >sequential, diverging and categorical palettes</a> that are accessible to people with color vision deficiencies.</p> <p>Even if your software does not offer any colorblind-friendly palettes, there are a lot of online tools that you might find useful. For example, <a href="https://www.color-blindness.com/coblis-color-blindness-simulator/" target="_blank" rel="noopener" >this tool</a>, which I mentioned previously, allows you to simulate how any image is perceived by people with color vision deficiencies. It can be used to make sure that your diagrams are accessible to colorblind people, though keep in mind that every simulation has its limitations. You can also try out <a href="https://colorbrewer2.org" target="_blank" rel="noopener" >this tool</a>, which allows you to build sequential, diverging and categorical palettes that are colorblind-friendly.</p> <p>Whether by using built-in functionality of your software or by employing external tools, you should consider colorblind readers when designing your diagrams. Not only is it the right thing to do, but it will also make your work more impactful—after all, more people will be able to understand it.</p> <h2 id="references">References</h2> <ol ><li id="bibreference-1">M. Okabe and K. Ito, <q>How to make figures and presentations that are friendly to color blind people,</q> <em>University of Tokyo</em>, 2002. </li><li id="bibreference-2">B. Wong, <q>Points of view: Color blindness,</q> <em>Nature Methods</em>, vol. 8, no. 6, p. 441, 2011. 
doi:<a rel="noopener" target="_blank" href="https://doi.org/10.1038/nmeth.1618">10.1038/nmeth.1618</a></li></ol> <hr> <p><strong>Update #1</strong> (July 20, 2022): I added hex values for the Okabe-Ito palette.</p> <div role="doc-endnotes"> <hr> <ol> <li id="fn:1"> <p>$1 - \left( 1 - 1/12 \right)^3 \approx 0.23 = 23\%$ <a href="#fnref:1" role="doc-backlink">↩︎</a></p> </li> </ol> </div> </article> <article> <h1>Do Not Trust WhatsApp</h1> <p>Dovydas Joksas — Wed, 01 Jul 2020 11:00 +0000</p> <p>WhatsApp, with its 2 billion users, is the most popular messaging app in the world. In addition to a user-friendly interface and many useful features, it claims to offer end-to-end encryption, meaning that only the sender and the recipient, and no one in-between, can read the messages. However, some do not feel comfortable using WhatsApp because it is owned by Facebook, whose track record on data privacy is <a href="https://dictionary.cambridge.org/dictionary/english/understatement" rel="noreferrer" target="_blank">not great</a>. It might indeed be a problem, but not the biggest one. The fact that <strong>WhatsApp is closed-source software</strong> is what should worry you the most.</p> <h2 id="why-open-source">Why Open Source?</h2> <p>I am not yet an open-source evangelist but when it comes to data privacy and secure communications, the code being available to the public is a must for me. WhatsApp, like many other messaging platforms, claims to have implemented the <a href="https://signal.org/docs/" rel="noreferrer" target="_blank">Signal Protocol</a>. It is an open-source protocol and the gold standard of end-to-end encrypted communication—it ensures secure exchange of messages between two or more individuals. But although the Signal Protocol is open source, WhatsApp is not. 
How that impacts us depends on which one (or superposition) of the following universes we live in: The Good, the Bad and/or the Ugly.</p> <h3 id="the-good">The Good</h3> <p><em>WhatsApp developers have only good intentions and do not make any mistakes</em>.</p> <p>It is possible that WhatsApp has indeed implemented the Signal Protocol without any backdoors or bugs. It would make little sense for them to risk their reputation and not ensure the one thing that they became known for—privacy. As a matter of fact, if you analyse the messages that WhatsApp transmits, they do seem to be encrypted; well, at least there is no obvious way to decrypt them without a cryptographic key.</p> <h3 id="the-bad">The Bad</h3> <p><em>WhatsApp is evil.</em> (Dear Mark Zuckerberg, this is a hypothetical. Please do not sue me. I am poor.)</p> <p>Even if WhatsApp communications are encrypted in some way, there is much more to secure exchange of information than just sending an encrypted message. In the end, we still want at least two parties to be able to read those messages; this is where cryptographic keys come in. The Signal Protocol uses a complex process of generating these keys for two (or more) users (see <a href="https://www.youtube.com/watch?v=DXv1boalsDI" rel="noreferrer" target="_blank">this Computerphile video</a>, for example). However, with WhatsApp being closed source, there is no way of verifying if this process is properly implemented. It is, in theory, possible for them to introduce a backdoor without our knowledge. Such a backdoor could be abused either by the company itself (for commercial purposes) or by other entities, such as governments.</p> <h3 id="the-ugly">The Ugly</h3> <p><em>WhatsApp developers make mistakes.</em></p> <p>As a matter of fact, all developers make mistakes. With any large software project, there will almost certainly be a scenario that results in an unintended behaviour of the program. 
Although some of the deviations from the engineered behaviour might be harmless, secure messaging apps should be held to a much higher standard. Having code open to the public allows many more people to spot potential vulnerabilities, which can then be dealt with. Of course, one can argue that keeping the code closed source would prevent the bad guys from spotting the vulnerabilities in the first place and then exploiting them. In practice, this strategy usually does not work because you do not necessarily need to see the code to exploit the vulnerabilities of the software.</p> <h2 id="what-is-the-alternative">What is the Alternative?</h2> <p>Due to its popularity, I focused on WhatsApp in this post, but these considerations apply to any messaging platform that claims to offer end-to-end encryption, and yet is closed source. Why you should care about privacy at all is a post for another time. However, if you do care about it and you think in terms of potential risks, WhatsApp is <em>not</em> the way to go.</p> <p>My personal recommendation is <a href="https://signal.org/en/" rel="noreferrer" target="_blank">Signal</a>, which I use to communicate with most of my friends and family. It is provided by a non-profit organization led by Moxie Marlinspike, who originally co-developed the Signal Protocol. Signal has state-of-the-art security features and, importantly, is open source. Even if you are not a cryptographer (neither am I!), being open source makes it easier for those with the expertise to inspect the code and suggest improvements (as is constantly happening on <a href="https://github.com/signalapp" rel="noreferrer" target="_blank">their GitHub page</a>). Signal has now become the natural first choice for journalists, activists and even politicians. 
We all should have the right to privacy and, fortunately, the technology for that is here.</p> </article> <article> <h1>Analogue Solution to the Power-Hungry Field of Machine Learning</h1> <p>Dovydas Joksas — Fri, 28 Feb 2020 11:00 +0000</p> <p><em>Originally published <a href="https://www.linkedin.com/pulse/analogue-solution-power-hungry-field-machine-learning-dovydas-joksas/" rel="noreferrer" target="_blank">here</a></em>.</p> <p>Machine learning methods—although performing many cognitive tasks successfully—are very time- and power-consuming, and, consequently, bad for the environment. For example, training a large neural network can emit as much carbon dioxide as five cars in their lifetimes [<a href="#bibreference-1" title="E. Strubell, A. Ganesh, and A. McCallum, Energy and policy considerations for deep learning in NLP, 2019. [Online]. Available: http://arxiv.org/abs/1906.02243 ">1</a>]. Because of the ever-increasing use of machine learning techniques (such as neural networks) it is important to consider whether it is possible to make them more efficient. The main roadblock for further rapid development might be our current hardware platforms that we use to implement machine learning. Alternatives, such as physically implemented neural networks, have been suggested to match the hardware to the task.</p> <p>Currently, machine learning algorithms—like most others—are being implemented in digital computing systems; these rely on storing information in binary format and processing it using binary logic. In such digital computers, computations are well-controlled, predictable and accurate. However, even simple tasks are not trivial to realise and require a lot of space—storing an 8-bit number requires tens of transistors, while performing mathematical operations (e.g. multiplication) with such numbers requires thousands of transistors. Furthermore, conventional computers use an architecture which separates memory and computing. 
Such separation results in large amounts of information being shuffled between these two modules during data-intensive tasks. This presents an enormous challenge and a bottleneck for improving power efficiency.</p> <p>Using an analogue approach, as well as merging the memory and computing modules, has been suggested in the past. An analogue implementation would employ devices capable of encoding numbers on a continuous range, thus increasing the density of encoded information. However, it is often feared that such systems would not be accurate enough for practical purposes. Nevertheless, such implementations could increase the speed and power efficiency of various computing operations by orders of magnitude. It is thus important to consider whether it is possible to employ them at least in some contexts without sacrificing accuracy.</p> <p>Artificial neural networks (ANNs) seem to be one of the machine learning models that can handle a certain amount of inaccuracy associated with analogue computing. ANNs are structures that consist of two main elements: neurons and synapses (see diagram in Figure 1). Synapses amplify the signals, while neurons add and transform those signals; as a whole they can perform tasks like classification (more information on how ANNs operate can be found <a href="https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi" rel="noreferrer" target="_blank">here</a>).</p> <div > <figure > <figcaption ><strong>Figure 1:</strong> Architecture of artificial neural networks with neurons represented using coloured circles and synapses as connections between those circles. Input (e.g. an image of a handwritten digit) is fed into the leftmost neuronal layer, after which a signal propagates to the right. A prediction (e.g. 
guessing what digit the image depicts) is made by the rightmost neuronal layer of the network.</figcaption> </figure> </div> <p>Analogue implementations of ANNs could be realised using devices called memristors, whose resistance can be easily varied—a key property of synaptic behaviour. Thus, memristors could implement the synapses of neural networks—devices would be arranged in two-dimensional structures, called crossbar arrays, that resemble the way synapses are arranged in ANNs. Most importantly, this would make it easy to perform matrix-vector multiplications, which are the costliest operations in these networks. In this way, ANNs could be made orders of magnitude more efficient.</p> <p>Resistive random-access memory (RRAM) devices—being a type of memristor—are among the most promising candidates for such implementations. They can be manufactured using conventional semiconductor materials and are relatively easy to integrate with current computing systems. Although ANNs can tolerate some inaccuracy due to their highly parallel and interconnected nature, the more precisely one can program RRAM devices, the better the performance of an ANN will be. Thus, many groups working on RRAM are trying to optimise various device properties: dynamic range, chance of failure, programming non-linearities, current/voltage non-linearities, device-to-device variability, and others. However, there is little prioritisation of some properties over others because the way each of these non-idealities affects ANNs is not fully understood.</p> <h2 id="our-approach">Our Approach</h2> <p>We decided to look at each of the properties (shown in Figure 2) separately and evaluate their effect on network performance. We simulated many different configurations of ANNs, and then disturbed their weights to reflect the effects of these RRAM properties. 
These randomized disturbances were applied multiple times to get a reliable estimate of the average effect that they have on physically implemented neural networks.</p> <div > <figure > <figcaption ><strong>Figure 2:</strong> Properties of RRAM devices that have an influence on ANN accuracy.</figcaption> </figure> </div> <p>In our analysis [<a href="#bibreference-2" title="A. Mehonic, D. Joksas, W. Ng, M. Buckwell, and A. Kenyon, Simulation of inference accuracy using realistic RRAM devices, Frontiers in Neuroscience, vol. 13, p. 593, 2019. doi:10.3389/fnins.2019.00593">2</a>], we discovered that different non-idealities have very different effects on ANN performance. We found that, in realistic scenarios, a small proportion of devices failing and non-linear programming or current/voltage curves are tolerable—the decrease in inference accuracy is relatively small. The range of resistances that the devices can be set to, i.e. the dynamic range, can have a much more detrimental effect. However, it can be mitigated by employing a different mapping scheme. That is, it is possible to represent the synaptic weights using resistances so that the accuracy would not be affected, even if the range of those resistances is small. We found that the most important factor affecting accuracy is device-to-device variability. When the manufactured devices do not respond to the inputs in the same way, the accuracy can drop considerably. Moreover, this non-ideality is more difficult to deal with as it cannot be avoided by simply using a different mapping or programming scheme.</p> <p>Although some qualitative trends were observed, at the moment it is difficult to draw general quantitative conclusions that would be applicable to all RRAM devices. The nature of non-idealities differs not only with different materials, but also with different physical dimensions of the devices. 
Although the effects of properties like dynamic resistance range are applicable to many different variants of RRAM devices, some other non-idealities, such as device-to-device variability, can vary a lot between differently manufactured devices. Moreover, non-idealities can manifest themselves differently when using different network architectures. Thus, this is just the beginning of an exploration to build a more complete picture of the non-idealities of RRAM devices and the effect they have on the neural networks that they constitute.</p> <h2 id="where-are-we-at">Where are we at?</h2> <p>Analogue devices, such as RRAM, could potentially solve the problem of the high power consumption of ANNs. Although there are several competing types of analogue devices being considered, a focused approach to device optimisation will be key to their successful integration into mainstream computing systems. Instead of optimising device properties that are important in conventional memory technology, it will be crucial to understand what role each of them plays in the specific context of neural networks.</p> <p>Despite the rapid development of RRAM device technology, the discussion about the relative importance of the various non-idealities in physically implemented ANNs has been very limited. Thus, it is still difficult to take a structured approach to the optimisation of these devices. Our simulation results show that, of all RRAM device non-idealities, device-to-device variability can have the largest effect. Although the nature of this and other non-idealities can differ in different types of RRAM devices, the systematic approach that we take provides a more comprehensive understanding of how various device properties affect ANN accuracy. 
We hope that this analysis will inform researchers trying to optimise RRAM devices suited for physical implementations of neural networks and that it will accelerate the growth of this exciting field even further.</p> <p><em>Thanks to <a href="https://www.sunnybains.com/" rel="noreferrer" target="_blank">Dr Sunny Bains</a> for reading the drafts of this post and for helping me organize my thoughts.</em></p> <h2 id="references">References</h2> <ol ><li id="bibreference-1">E. Strubell, A. Ganesh, and A. McCallum, <q>Energy and policy considerations for deep learning in NLP,</q> 2019. [Online]. Available: <a rel="noopener" target="_blank" href="http://arxiv.org/abs/1906.02243">http://arxiv.org/abs/1906.02243</a> </li><li id="bibreference-2">A. Mehonic, D. Joksas, W. Ng, M. Buckwell, and A. Kenyon, <q>Simulation of inference accuracy using realistic RRAM devices,</q> <em>Frontiers in Neuroscience</em>, vol. 13, p. 593, 2019. doi:<a rel="noopener" target="_blank" href="https://doi.org/10.3389/fnins.2019.00593">10.3389/fnins.2019.00593</a></li></ol> </article> <article> <h1>The Power Crisis of Modern Computing</h1> <p>Dovydas Joksas — Tue, 03 Dec 2019 11:00 +0000</p> <p><em>Originally published <a href="http://energyjournal.co.uk/Edition_8#8_A6" rel="noreferrer" target="_blank">here</a></em>.</p> </article> </main></body></html>