GenAI Policy

There are many generative AI tools out there, from general-purpose large language models (LLMs) and image generation models to research-grade software that is related to our own work. These tools must be used with care. The group's current policy is described below. It is in addition to, but does not replace, any Princeton-wide policies on the use of GenAI tools.
General Comments
It is important to be aware of how our skills may be impacted when we overly rely on GenAI tools. I ask that you read "AI gravity", which is a nice perspective on the topic.
If you use a cloud-based LLM, at minimum you should turn off data sharing so that your conversations are not used for model training.
With Claude, this can be found in Settings > Privacy > Help improve Claude (should be toggled off)
With ChatGPT, this can be found in Settings > Data controls > Improve the model for everyone (should be "Off")
Everyone in the group is welcome to have a subscription (paid for by the lab) to Claude Pro with access to Claude Code (or an equivalent service, such as ChatGPT Plus with access to Codex). Similarly, you are welcome to have a subscription to GitHub Copilot. If you are interested, ask Andrew how to get this paid for by the lab.
Everyone in the group is welcome to use the AI tools provided by the Princeton AI Sandbox Service, which provides additional data privacy compared to the public tools. To access the Princeton AI Sandbox Service, ask Andrew to fill out the form.
If it is determined that you are over-relying on GenAI tools in a manner that impedes your learning or if you use GenAI tools in an inappropriate manner, your privileges for using them will be revoked.
Currently, there are no plans for the group to pursue research directions that specifically involve LLMs or agentic AI. We will leave that to others.
Acceptable and Unacceptable Uses
LLM-generated code and writing are, on average, pretty mediocre. You should strive to do better than mediocre. In general, do not outsource your skills to an LLM.
LLMs are fine—and, perhaps, even ideal—to use for ideation (e.g. "discussing" possible research directions, explaining difficult concepts) and troubleshooting (e.g. compiler error, Python package installation error).
You should still think critically about the responses. LLMs generate many seemingly good ideas that, in fact, are not logical.
LLMs can be used to assist with the writing process, but there are many important caveats that you must adhere to:
The ideas must be your own, not those of the LLM.
The text must be in your own words. You should not copy text from an LLM wholesale. It is obvious if you do.
An acceptable use could include something like: "Here is a paragraph I am working on. The second sentence does not flow very nicely. Can you help me identify what is not working and how I can fix this?" And then you could modify your text, drawing inspiration from the LLM's response.
An unacceptable use would be something like: "I need to write an introductory paragraph on why MOFs are great. Please write this for me."
LLMs should never be used to prepare references. Use Zotero for actual reference management. Remember that you should have your references in mind as you write, rather than trying to sprinkle in random references after the fact.
LLMs regularly hallucinate references. There is a zero-tolerance policy for this in the group.
LLMs can be used to assist with coding, but you must still be in control of what you are doing. Here are some guidelines:
You must understand the code you use, and you should not rely on LLM-generated code without carefully checking everything. You do not want to get into a situation where your research results are impacted without your knowledge.
If an LLM generates a coding pattern that you do not fully understand, take the time to figure it out. LLMs are great at generating code that superficially looks right but is, in fact, problematic.
LLMs are trained on publicly available code, whether it is right or not. As a result, LLMs often produce incorrect scripts with materials science tools like ASE, probably because the average person's ASE code on the internet is not very good.
If you are using a new code for the first time, you should not rely on an LLM to do it for you. This is very dangerous since you will not know if errors are made, and you will not learn the skills you need to improve as a programmer.
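As a purely illustrative sketch of what "carefully checking everything" can look like in practice, the snippet below takes a small helper of the kind an LLM might suggest and checks it against answers worked out by hand. The function name `min_image_distance`, the orthorhombic-cell assumption, and the test values are all hypothetical choices made for this example. They are not part of any group code or of ASE.

```python
import math


def min_image_distance(a, b, box):
    """Distance between points a and b in an orthorhombic periodic box.

    Hypothetical helper for illustration only. It assumes a cell with
    orthogonal axes and side lengths given by `box`.
    """
    total = 0.0
    for ai, bi, length in zip(a, b, box):
        d = bi - ai
        # Shift the separation by whole box lengths so it lies within
        # half a box length of zero (minimum image convention).
        d -= length * round(d / length)
        total += d * d
    return math.sqrt(total)


# Checks against values computed by hand before trusting the function.
box = (10.0, 10.0, 10.0)

# Two points on opposite sides of the box are 1.0 apart through the boundary.
assert math.isclose(min_image_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), box), 1.0)

# Two nearby points that never cross a boundary are also 1.0 apart.
assert math.isclose(min_image_distance((1.0, 1.0, 1.0), (2.0, 1.0, 1.0), box), 1.0)

# Wrapping along two axes at once gives sqrt(2).
assert math.isclose(min_image_distance((0.0, 0.0, 0.0), (9.0, 9.0, 0.0), box), math.sqrt(2.0))
```

The point is not this particular function but the habit: work out a few cases yourself first, and only accept the generated code if it reproduces them and you can explain each line.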
Do not upload any proprietary data or code to a cloud-based AI service, as this breaches confidentiality clauses even if you disable data sharing.
Image-based GenAI models should not be used for anything in a manuscript (e.g. for making figures/schemes). These models can hallucinate, and the results do not look very professional. By all means, use image-based GenAI models for fun or to brainstorm ideas. In PowerPoint presentations, they should be used sparingly, if at all.
Unless agreed upon in advance as part of the project scope, you should not use agentic AI tools to run atomistic simulations. It is important for you to have full control and understanding of this process.
It hopefully goes without saying, but using GenAI tools to intentionally fabricate data is considered academic misconduct and will be treated as such. Don't even think about it.