Generative AI Policy

I don’t use generative AI or LLMs in my work at this time. You can trust that no artifact I author (prose or code) has involved them at any stage.

I don’t have any specific objection or coherent dogma around these tools; I try to follow the field very closely and I’m overall ambivalent on the topic.

> [!info] Feb 2026
> In a post on this blog which specifically relates to what a Google search returns, I’ve pulled a few quotes from the AI mode to comment on. This continues to be the extent of my use of AI for any artifacts or research.

My reasoning is as follows:

  • By a variety of metrics, the conventional resources I’m using now seem to be working reasonably well.
    • That said, as a taxpayer-funded researcher I have an obligation to the public to use the best tools I can, not just what I happen to know or be comfortable with - so this might be a bit of a cop-out.

I’ve seen people pooh-pooh the next generation in ways that I think are unreasonable - telling off a grad student for watching a training video on a topic rather than reading a textbook, for example. I don’t want to perpetuate that tradition.

However, this is my internal position at this time.

https://en.wikibooks.org/wiki/Wikibooks_talk:Artificial_Intelligence#Use_of_LLMs_for_this_policy_(and_evidence_of_issues)

nice slide deck.

helen reaction

Ancillary to all the discussion about the outputs, there’s the whole “scraperbots run amok” chilling effect.

https://arxiv.org/pdf/2402.08021

Trying to make arguments about the energy efficiency or water use of AI is a bit on the nose for me, because look at the energy use of the CLS (Canadian Light Source): maybe 600 N shifts a year at 10 megawatts continuously - 146 megawatt-hours per shift.

On the other hand, what’s the power consumption of the Rogers Centre / SkyDome? It looks to be roughly the same order of magnitude.

Carbon intensity of the local grid is around 630 g CO₂e/kWh.
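A quick back-of-envelope check of those numbers (assuming the 630 figure is the local grid carbon intensity in g CO₂e/kWh, which the note above doesn’t spell out):

```python
# Back-of-envelope check of the CLS figures above.
POWER_MW = 10           # continuous draw
HOURS_PER_YEAR = 8760
SHIFTS_PER_YEAR = 600   # N shifts per year
CARBON_G_PER_KWH = 630  # assumed grid intensity, g CO2e/kWh

annual_mwh = POWER_MW * HOURS_PER_YEAR        # 87,600 MWh/yr
per_shift_mwh = annual_mwh / SHIFTS_PER_YEAR  # MWh per shift
# grams -> tonnes: MWh * 1000 kWh/MWh * g/kWh / 1e6 g/t
per_shift_tco2 = per_shift_mwh * 1000 * CARBON_G_PER_KWH / 1e6

print(f"{per_shift_mwh:.0f} MWh per shift, ~{per_shift_tco2:.0f} t CO2e per shift")
# -> 146 MWh per shift, ~92 t CO2e per shift
```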

Crucially, we need systems of accountability. Every line of code in a research project, whether written by humans or AI, must be attributable to a specific person who can defend its logic, validate its assumptions and explain its choices to reviewers. This means establishing clear lab policies about who takes responsibility for AI-generated code before it’s used, maintaining detailed audit trails of AI interactions and ensuring that whoever claims authorship of an analysis can actually explain how it works.

The neuroscientists who embrace this transition thoughtfully, learning to direct AI agents while maintaining critical oversight, will have tremendous advantages in tackling the field’s biggest challenges. Rather than viewing this as a threat, we can see it as an opportunity to focus our cognitive resources on the uniquely human aspects of science: asking better questions, designing more elegant experiments and developing deeper theoretical insights. The computational heavy lifting can increasingly be delegated to AI, freeing us to do what we do best.

The brain is the most complex system we know. We need every computational advantage available to understand it, and we have the wisdom to use these tools responsibly. The future of neuroscience depends on getting this balance right. The question isn’t whether we should use AI to accelerate our research—it’s how we’ll develop the expertise to use it exceptionally well.

https://www.thetransmitter.org/craft-and-careers/should-neuroscientists-vibe-code/

https://teaching.usask.ca/learning-technology/tool-sorting-lists/tools-by-type.php

some thoughtful guidance

I attended a presentation on time management today. It was unfortunate: the presenter clearly had a lot of personal experience with time management, but none of it made it into the presentation, which I’m quite certain was written with ChatGPT.

They say to “write what you know”.

An update on my previous post https://mathstodon.xyz/@tao/115306424727150237 regarding how I was able to use modern AI to help solve a MathOverflow problem. Superficially, this story favors the “you should use AI whenever possible” narrative over the “you should avoid AI whenever possible” one; but, as I hope the discussion below will show, the more accurate narrative lies in between these two extremes. The story is told more or less completely at the MathOverflow page https://mathoverflow.net/questions/501066/is-the-least-common-multiple-sequence-textlcm1-2-dots-n-a-subset-of-t , but the responses are not arranged in chronological order, and so the history may be a little hard to follow from that page.

So, was AI-assistance “better” than crowdsourcing the problem, or working on the problem alone? It seems the answer is highly situational. For this particular problem, there was a key juncture where there was a significant “activation energy” required to make further progress, in which both individual or crowdsourced human attention was not sufficient to overcome; but this obstacle had favorable conditions for AI use, namely a well-understood task, decomposable into simpler steps, that could be verified externally. But through the “beat the AI” challenge, this AI use also made the crowdsourcing option available, leading to more creative (and more socially enjoyable) solutions that no longer required significant AI assistance, although some use was still helpful. It is conceivable that increased AI usage at this stage could have accelerated the advent of the final solution further by a few hours, but this was not a time-sensitive project, and I think it was a better outcome for the rest of the project to develop organically now that it had become unblocked. (5/5)

https://simonwillison.net/2025/Oct/14/nvidia-dgx-spark/

I feel like this is a useful example of how some models create some weird code.

```sh
# pick the first free UID >=1000
U=$(for i in $(seq 1000 65000); do if ! getent passwd $i >/dev/null; then echo $i; break; fi; done)
echo "Chosen UID: $U"

useradd -m -u "$U" -g "$G" -s /bin/bash dev
```

As far as I can tell, useradd already picks the first free UID when run without `-u`. That `$U` variable isn’t used anywhere else in the script, and letting useradd pick would have the benefit of being robust if the system’s UID allocation policy changes.
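A sketch of the same step without the UID-hunting loop (this assumes the same pre-existing `$G` group variable from the snippet, and needs root, so treat it as a fragment rather than something to paste verbatim):

```sh
# useradd reads UID_MIN (typically 1000) from /etc/login.defs and picks
# the first free UID at or above it when -u is omitted
useradd -m -g "$G" -s /bin/bash dev
```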

Recently a fairly tech-averse friend of the family asked how he could sort an Excel table in a specific, convoluted way, and whether any of the AI features could help. I flatly didn’t know the answer - which is not a situation I like to be in.

whiplash

https://words.filippo.io/claude-debugging/?source=Mastodon

On a whim, I figured I would let Claude Code take a shot while I read emails and resurfaced from hyperfocus. I mostly expected it to flail in some maybe-interesting way, or rule out some issues.

Instead, it rapidly figured out a fairly complex low-level bug in my implementation of a relatively novel cryptography algorithm. I am sharing this because it made me realize I still don’t have a good intuition for when to invoke AI tools, and because I think it’s a fantastic case study for anyone who’s still skeptical about their usefulness.

As ever, I wish we had better tooling for using LLMs which didn’t look like chat or autocomplete or “make me a PR.” For example, how nice would it be if every time tests fail, an LLM agent was kicked off with the task of figuring out why, and only notified us if it did before we fixed it?
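A minimal sketch of the test-failure hook described in that last paragraph, with `llm-agent` as a placeholder name (not a real CLI) for whatever agent tool is actually available:

```shell
# Run a test command; on failure, hand the log to an agent in the background
# while the human keeps working. `llm-agent` is a hypothetical stand-in.
run_tests_with_agent() {
  local log
  log=$(mktemp)
  if "$@" >"$log" 2>&1; then
    echo "tests passed"
  else
    # fire-and-forget: only surface the agent's answer if it finishes first
    llm-agent "these tests failed; figure out why" <"$log" &
    echo "tests failed; agent investigating in the background"
  fi
}
```

Usage would look like `run_tests_with_agent make test` (or `go test ./...`), wired into whatever runs your test suite.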

This is an interesting take on using LLMs for practical purposes, from someone whose values I think align somewhat with mine, and who values storytelling:

At first I tried LLMs for keyword matching, but the tone was SO wrong from how I usually write that it felt artificial. Then I remembered that I have content on this blog since 2003 (hell yeah Live Journal import!), all of which match my tone. So I asked the LLM to read my blog, store boilerplate resumes for each of the job types I was going for, and to keyword match a specific resume (or combo of resumes) to a JD while maintaining my tone. It worked wonders!

Always still check for accuracy, because hallucinations sure do happen. The things I would have been claiming to have done if I hadn’t checked! I spent time with each suggested set of changes to be sure they were accurate and in my tone. I also added a disclaimer at the bottom of my resume about how I use LLMs. When a particular JD would ask to not use LLMs, I would respect that.

https://blog.bl00cyb.org/2025/11/the-job-hunt/#more-6013

Full disclosure: about a week before their title was announced, which is like a year and a half ago, I was thinking of writing a book similar in theme, and I even had a title in mind, which was “The AI Con”! So I get it. And to be clear I haven’t read Bender and Hanna’s entire book, so it’s possible they do not actually dismiss it.

And yet, I think Wong has a point. AI is not going away; it’s real, it’s replacing people at their jobs, and we have to grapple with it seriously.

Ignoring the chatbot era or insisting that the technology is useless distracts from more nuanced discussions about its effects on employment, the environment, education, personal relationships, and more.

I’m with Wong here. Let’s take it seriously, but not pretend it’s the answer to anyone’s dreams, except the people for whom it’s making billions of dollars. Like any technological tool, it’s going to make our lives different but not necessarily better, depending on the context.

https://mathbabe.org/2025/07/15/whats-the-right-way-to-talk-about-ai/

Much of the discussion around generative AI seems to center on “making workers more efficient”.

Maybe we should start using that language about things like DALYs or endometriosis, which damage workers’ efficiency.

I’m using a secret URL Gist served through gistpreview.github.io here to reduce the chance of this unverified slop content getting indexed by crawlers and making it out into the wider world.

https://til.simonwillison.net/llms/o4-mini-deep-research