hobby 2020

Colour Polygraph

A self-directed experiment from the second year of upper-secondary school. I built a twenty-question colour survey, sent it to around 160,000 students through the Oslo school directory, and ended up with 6,731 cleaned sessions and a small neural net that could guess a participant's age, gender and self-reported mood from nothing but which colour swatches they picked.

Explore the data

Before the write-up, the toy. Drag the cube to rotate it, double-click to resume the slow auto-rotate. The buttons swap between boys, girls, the difference between them, and a free pair of sliders. Age filtering is off by default; tap "All ages" to switch the age slider on. Each glowing dot is one cell of an 8×8×8 RGB grid, sized by how often colours from that cell were picked compared to how often they were shown.
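The per-cell sizing described above can be sketched in a few lines. This is a minimal illustration, not the project's aggregator: the `trials` layout (a list of swatches shown plus the one picked) and the names `voxel`, `pick_rates` and `min_shown` are assumptions made for the example.

```python
from collections import Counter

GRID = 8            # cells per RGB axis, as in the 8x8x8 cube
CELL = 256 // GRID  # each cell spans 32 intensity levels


def voxel(rgb):
    """Map an (r, g, b) triple with 0-255 channels to its grid cell."""
    r, g, b = rgb
    return (r // CELL, g // CELL, b // CELL)


def pick_rates(trials, min_shown=1):
    """Return {voxel: picked / shown} for voxels shown at least min_shown times.

    `trials` is an iterable of (shown, picked) pairs, where `shown` is the
    list of swatch colours in one trial and `picked` is the chosen one.
    """
    shown_count = Counter()
    picked_count = Counter()
    for shown, picked in trials:
        for colour in shown:
            shown_count[voxel(colour)] += 1
        picked_count[voxel(picked)] += 1
    return {
        cell: picked_count[cell] / n
        for cell, n in shown_count.items()
        if n >= min_shown
    }


# Two made-up trials of four swatches each (hypothetical data).
trials = [
    ([(250, 10, 10), (10, 250, 10), (10, 10, 250), (120, 100, 20)], (10, 10, 250)),
    ([(240, 20, 20), (20, 20, 240), (200, 200, 200), (110, 100, 30)], (20, 20, 240)),
]
rates = pick_rates(trials)
```

With four swatches per trial, a cell picked purely at random would hover around a rate of 0.25, so values well above that suggest a liked region of the cube and values near zero a rejected one. Raising `min_shown` keeps rarely shown cells from dominating either end.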


Most-loved colours

Voxels with the strongest preference score: shown to many people and picked far more often than chance.

Most-rejected colours

Voxels people were shown over and over but almost never picked. Muddy yellow-browns lose every popularity contest.

Headline numbers

    Mood by group

    Self-reported 0–60 score, lower is happier. Adolescence visibly bites.

    The question

    Does which colours you prefer, and how you click on them, carry enough signal to predict who you are? I had a hunch the answer was yes, and I wanted the dataset to be big enough that the answer wasn't a coincidence.

    The survey

    Participants answered three demographic questions (age, gender and a 1 to 10 mood rating), then completed twenty trials. Each trial showed four colour swatches and asked them to pick the one they liked best. After all twenty, the sixteen surviving favourites were grouped into fours, and the four winners then met in a final round of four, until one colour was left standing.
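The knockout stage can be sketched as a loop that keeps splitting the survivors into groups of four. This is an illustrative reconstruction, not the survey's code: `knockout`, `choose` and the sixteen sample colours are hypothetical, and `choose` stands in for the participant's click.

```python
def knockout(colours, choose, group_size=4):
    """Run rounds of `group_size`-way choices until one colour remains.

    `choose` is a callback that receives a list of colours and returns the
    preferred one; in a live survey it would be the participant's click.
    """
    survivors = list(colours)
    while len(survivors) > 1:
        if len(survivors) % group_size:
            raise ValueError("survivor count must be a multiple of group_size")
        survivors = [
            choose(survivors[i:i + group_size])
            for i in range(0, len(survivors), group_size)
        ]
    return survivors[0]


# Sixteen made-up favourites running from pure blue towards red.
favourites = [(i * 16, 0, 255 - i * 16) for i in range(16)]


def prefers_blue(group):
    """Stand-in participant who always picks the bluest swatch."""
    return max(group, key=lambda c: c[2])


winner = knockout(favourites, prefers_blue)
```

Sixteen entrants take two rounds: four groups of four, then one final group of four, which is five choices in total.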

    Alongside the picks, the survey quietly logged the meta-signal: how long each response took, where on the swatch the click landed, whether the cursor wavered, and the order in which participants tended to scan the options.

    Reach and response

    At sixteen the most surprising part wasn't the model. It was watching a dataset of twenty thousand human responses arrive in two days.

    Cleaning the data

    The dataset arrived dirty. Around 20,000 people finished the survey, but the cube at the top only runs on 6,731 of them. Everything else got filtered out by a chain of sanity checks that I built up as I noticed the failure modes.

    What survives is the 6,731 sessions where I'm reasonably sure a real human was making a real choice. That's the cleaned set the cube at the top is drawing from.
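A chain of sanity checks like this is often written as a function that returns the first reason a session fails. The sketch below is only an illustration of that shape. The field names (`client_id`, `age`, `response_ms`, `positions`), every threshold, and the specific checks are assumptions, loosely based on the failure modes mentioned in this write-up (speed-running, joke ages, duplicates), and not the original cleaner's rules.

```python
from statistics import median

# All thresholds and field names below are illustrative assumptions,
# not the values used in the original project.
MIN_AGE, MAX_AGE = 6, 99
MIN_MEDIAN_MS = 400
JOKE_AGES = {69}


def rejection_reason(session):
    """Return a short reason string if the session looks unreliable, else None."""
    age = session["age"]
    if age in JOKE_AGES or not MIN_AGE <= age <= MAX_AGE:
        return "implausible age"
    if median(session["response_ms"]) < MIN_MEDIAN_MS:
        return "speed-run"
    if len(set(session["positions"])) == 1:
        return "same position every trial"
    return None


def clean(sessions):
    """Keep the first session per client id that passes every check."""
    seen = set()
    kept = []
    for s in sessions:
        if s["client_id"] in seen or rejection_reason(s):
            continue
        seen.add(s["client_id"])
        kept.append(s)
    return kept


# Made-up sessions, shortened to three trials each for readability.
sessions = [
    {"client_id": "a", "age": 17, "response_ms": [900, 1200, 800], "positions": [0, 2, 1]},
    {"client_id": "a", "age": 17, "response_ms": [950, 1100, 700], "positions": [1, 3, 0]},
    {"client_id": "b", "age": 69, "response_ms": [900, 1000, 950], "positions": [2, 1, 3]},
    {"client_id": "c", "age": 16, "response_ms": [120, 150, 90], "positions": [0, 1, 2]},
    {"client_id": "d", "age": 18, "response_ms": [900, 1000, 1100], "positions": [0, 0, 0]},
]
cleaned = clean(sessions)
```

Returning a reason rather than a bare boolean makes it easy to count how many sessions each check removes, which helps when tuning thresholds. Note the trade-off in a joke-age filter: it also discards any genuine respondent of that age.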

    The model

    Features

    Architecture

    A small fully-connected network. Three targets, trained jointly: age (regression), gender (binary classification), mood (1 to 10 ordinal). I leaned on the meta-features as much as the colour picks themselves. They turned out to carry a lot of the signal, especially for age.
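Training three heads jointly usually means summing one loss per head into a single objective. The sketch below shows one plausible way to do that for a single example, using only the standard library. It is not the project's training code: the dictionary keys, the equal default weights, and the choice to treat the mood rating as ten unordered classes (which ignores its ordinal nature) are all assumptions.

```python
import math


def joint_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three per-head losses for one example.

    pred:   {"age": float, "gender_logit": float, "mood_logits": [10 floats]}
    target: {"age": float, "gender": 0 or 1, "mood": int in 1..10}
    """
    # Age head: squared error on the regression output.
    age_loss = (pred["age"] - target["age"]) ** 2

    # Gender head: binary cross-entropy on a single logit,
    # i.e. log(1 + exp(z)) - y * z in a numerically stable form.
    z = pred["gender_logit"]
    y = target["gender"]
    gender_loss = max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))

    # Mood head: softmax cross-entropy over the ten ratings
    # (a simplification that ignores the ordering of the scale).
    logits = pred["mood_logits"]
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
    mood_loss = log_norm - logits[target["mood"] - 1]

    w_age, w_gender, w_mood = weights
    return w_age * age_loss + w_gender * gender_loss + w_mood * mood_loss


# An untrained-looking prediction: one year off on age, no opinion elsewhere.
pred = {"age": 17.0, "gender_logit": 0.0, "mood_logits": [0.0] * 10}
target = {"age": 16.0, "gender": 1, "mood": 7}
loss = joint_loss(pred, target)
```

In practice the age term tends to dwarf the other two unless ages are normalised or the weights are tuned, so the `weights` argument is where most of the balancing would happen.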

    What I learned

    Three things. First, that how you click is at least as informative as what you click. Reaction time and click coordinates carried more weight in the age head than the colour picks themselves. Second, that running a study at scale is less about the model and more about the boring plumbing around it: surveys, storage, deduplication, abuse handling, the cleaning pipeline I just walked through. The model itself was the smallest file in the project.

    Third, that almost two thirds of a "completed" dataset can be noise. The 13,000 sessions I threw away weren't all malicious; plenty were kids speed-running for the sake of it, or people typing 69 as their age out of habit. The lesson stuck. Every dataset I've touched since then I treat as guilty until proven clean.

    I'd do this differently today: clearer consent, better aggregation, a published write-up. It was a high-school project and it shows in places, but it taught me that data beats cleverness, and I've never quite let go of that.

    Explore the data yourself

    Everything that powers the cube above lives in the public repo. The raw export, the Python cleaner that turned 20,000 rows into 6,731, and the aggregator that produced the JSON the viz reads are all in the project folder. If you want to try a different binning, a different filter, or train your own little net on it, this is where to start.
