on knowing what you do not know

the most dangerous phase in building something technical is when you are competent enough to move fast but not experienced enough to know which fast movements are mistakes.

i have been in that phase before. most builders have. you have internalized enough of the problem space that you stop second-guessing every decision. the second-guessing was slowing you down. removing it feels like progress. sometimes it is. sometimes it is just confidence outrunning understanding.

the tell is usually retrospective. you look back at a decision you made in month three and you can see the exact shape of what you did not know yet. the assumption you made that felt reasonable at the time but was actually load-bearing in a way you had not modeled. the design that was locally correct but globally wrong.

the thing about research

research is, among other things, a structured practice of knowing what you do not know.

not all research works this way in practice. a lot of research is aimed at confirming things you already believe, building evidence for a direction you have already committed to. that kind of research has its uses but it is not the kind that changes your thinking.

the research we have been doing lately has been the other kind. the kind where you start with a question and follow it without a strong prior on the answer. and several times in the last few weeks the answer has been something we did not expect. not dramatically. not in ways that invalidate prior work. but in ways that require updating.

i find that process uncomfortable in the moment and useful over time. it is uncomfortable because every unexpected finding is a small acknowledgment that the model you were operating with was incomplete. it is useful because an incomplete model that you know is incomplete is much safer to build on than an incomplete model you think is complete.

the SME problem

i have been preparing for a talk this week on identity verification and what it actually means for SME access to capital in nepal.

the more i have worked on this, the more i think the identity problem is not primarily a technical problem. the technology for verifying identity exists. what does not exist, at least not in any complete form, is infrastructure that makes that technology accessible and trustworthy enough that the institutions depending on it will actually stake lending decisions on it.

that gap is interesting because it is not a gap you can close by improving the model. you close it by being reliable over time. by building up a track record that institutions can point to when they make decisions based on your output. by operating in a way that earns the kind of institutional trust that takes years to accumulate and can be lost in an afternoon.

this is one of the things that makes helios feel important beyond the immediate commercial case. if we get it right, we are not just building a KYC product. we are building part of the trust infrastructure that makes a whole class of financial access possible for people who currently do not have it.

that framing keeps me honest about the work. it is easy to optimize for deployment speed or benchmark performance. it is harder but more important to optimize for the kind of reliability that justifies institutional trust.

what i keep coming back to

lately i have been reading a lot about how systems fail. not AI systems specifically. systems in general. bridges, hospitals, financial infrastructure, supply chains. the literature on failure is more interesting than the literature on success, mostly because failures are more honest. successes can be attributed to many things. failures usually have a specific cause that is visible in retrospect and was invisible in real time.

the common thread across most of the catastrophic failures i have read about is not a single large mistake. it is a series of small decisions that each seemed reasonable given the information available at the time, that collectively created a condition where one more small decision produced a catastrophic outcome.

this pattern is relevant to AI systems in ways that i think about constantly. the failure modes of an identity system that is meant to be reliable are not dramatic. they are cumulative. a bias in the training data that produces slightly worse performance on a specific demographic. a threshold calibrated for a different distribution than the one it encounters in production. a design decision that optimizes for accuracy under normal conditions at the expense of robustness under edge cases.

none of these are flashy. all of them are serious.
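
the first two of those failure modes can at least be measured. what follows is a minimal sketch in python, not a description of how helios does it. it assumes each verification attempt is logged as a (group, score, is_genuine) tuple, that an attempt is accepted when its score is at or above a threshold, and that "worse performance" means a higher false rejection rate for genuine users. the function names, the sample records, the 0.6 threshold and the 0.1 gap budget are all hypothetical.

```python
from collections import defaultdict


def false_reject_rate_by_group(records, threshold):
    """Return {group: share of genuine attempts scored below threshold}.

    records is an iterable of (group, score, is_genuine) tuples.
    This record shape is an assumption made for illustration.
    """
    genuine = defaultdict(int)
    rejected = defaultdict(int)
    for group, score, is_genuine in records:
        if not is_genuine:
            continue  # impostor attempts do not count toward false rejections
        genuine[group] += 1
        if score < threshold:
            rejected[group] += 1
    return {group: rejected[group] / genuine[group] for group in genuine}


def groups_over_budget(rates, max_gap):
    """List groups whose rate exceeds the best group's rate by more than max_gap."""
    best = min(rates.values())
    return sorted(group for group, rate in rates.items() if rate - best > max_gap)


# Hypothetical logged attempts: (group, match score, is the user genuine?)
records = [
    ("a", 0.90, True), ("a", 0.80, True), ("a", 0.70, True), ("a", 0.40, True),
    ("b", 0.90, True), ("b", 0.50, True), ("b", 0.45, True), ("b", 0.30, True),
    ("b", 0.20, False),
]

rates = false_reject_rate_by_group(records, threshold=0.6)
print(rates)                                   # {'a': 0.25, 'b': 0.75}
print(groups_over_budget(rates, max_gap=0.1))  # ['b']
```

the point of a check like this is not the arithmetic. it is that the gap between groups gets computed on production data on a schedule, so a threshold that was calibrated on one distribution does not quietly drift out of fit on another.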

knowing what you do not know means keeping these failure modes visible, continuously, even when the system is performing well. especially when the system is performing well.
