As artificial intelligence becomes more integrated into search engines and digital assistants, non-profits must understand how their websites are indexed, interpreted, and surfaced by AI models. This article outlines how to balance visibility and data control, particularly regarding AI training data, robots.txt configurations, and SEO strategies tailored for non-profits.
What Is AI Training Data and Why Does It Matter?
AI training data refers to the massive collections of information used to teach machine learning models how to understand and generate content. This data often includes publicly available website content, academic articles, code repositories, and more.
For non-profits, this means that portions of your website could potentially be used to train large language models (LLMs) like OpenAI’s GPT or Google Gemini. While exposure can increase visibility, some organizations may wish to restrict content due to privacy concerns, outdated material, or mission-specific sensitivities.
According to OpenAI, websites can opt out of future GPT training by using specific directives in their robots.txt file (source).
How to Prevent GPT and Other AI Models from Crawling Your Site
If your organization wants to keep LLM developers from using your content for training, the first step is to configure your site’s robots.txt file. Compliance with robots.txt is voluntary, so this only stops crawlers that choose to honor it. Here’s how:
Example: Blocking GPTBot (OpenAI)
User-agent: GPTBot
Disallow: /
Example: Blocking Common AI Crawlers
User-agent: GPTBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: ClaudeBot
Disallow: /
You can also block all bots with:
User-agent: *
Disallow: /
However, this will prevent your site from being indexed by traditional search engines like Google and Bing, so it’s generally not recommended for visibility purposes.
To implement this:
- Access the root directory of your website.
- Open or create a file named robots.txt.
- Paste the appropriate lines of code.
- Save and test with a tool such as the robots.txt report in Google Search Console.
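You can also check a draft of the rules locally before uploading it. The sketch below uses Python's standard-library `urllib.robotparser`; the `crawl_permissions` helper, the sample rules, and the `example.org` URL are hypothetical and only for illustration. Individual crawlers may interpret edge cases differently from Python's parser, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: block two AI-training tokens, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def crawl_permissions(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Return, for each user-agent token, whether the rules let it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["GPTBot", "Google-Extended", "Googlebot", "Bingbot"]
    # Hypothetical page URL; substitute a real page from your own site.
    results = crawl_permissions(ROBOTS_TXT, agents, "https://example.org/programs/")
    for agent, allowed in results.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With these sample rules, the two AI tokens are reported as blocked while Googlebot and Bingbot remain allowed, which is the selective outcome most non-profits are after.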
AI Search Is Changing SEO: What Non-Profits Need to Know
AI-enhanced search results, such as Google's AI Overviews (formerly Search Generative Experience, or SGE) and Microsoft Copilot in Bing, increasingly answer queries directly, so users may never click through to a website. To stay visible, non-profits need an SEO strategy tuned to how AI summarizes and ranks content.
Key Optimization Strategies:
1. Write for Questions, Not Just Keywords
AI search tools prioritize clear, well-structured answers. Optimize pages to directly answer:
- What does your organization do?
- Who do you serve?
- How can people get involved or donate?
Use H2/H3 tags to structure answers clearly and match query intent.
2. Use Schema Markup
Implement structured data to help search engines and AI systems understand the context of your content. Key schema types for non-profits:
- Organization
- Event
- FAQPage
- HowTo
- DonateAction
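Structured data is usually embedded in a page as JSON-LD. As a minimal sketch, assuming you generate pages with Python, the helpers below build `Organization` and `FAQPage` objects and wrap them in a script tag. The function names, the organization details, and the sample question are hypothetical; check the output with a structured-data validator before publishing.

```python
import json


def organization_jsonld(name: str, url: str, description: str) -> dict:
    """Build a schema.org Organization object."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }


def faq_jsonld(qa_pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def as_script_tag(data: dict) -> str:
    """Serialize to a JSON-LD script tag, escaping '</' so the tag cannot be closed early."""
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


if __name__ == "__main__":
    # Hypothetical organization and FAQ content.
    org = organization_jsonld(
        "Example Food Bank", "https://example.org", "Feeds families in need."
    )
    faq = faq_jsonld([("How can I donate?", "Visit our donation page or call our office.")])
    print(as_script_tag(org))
    print(as_script_tag(faq))
```

Paste the resulting tags into the `<head>` or body of the matching page; the FAQ markup should only describe questions and answers that are actually visible on that page.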
3. Focus on E-E-A-T
Google emphasizes Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T). For non-profits:
- Publish content from credible authors or team members
- Include testimonials and case studies
- Secure your site (HTTPS)
4. Optimize for Voice and Conversational Queries
AI tools are built around natural, conversational questions. Instead of targeting “nonprofit donation page,” target:
- “How can I support local nonprofits in Indiana?”
- “Where can I donate to help families in need near me?”
5. Keep Content Updated and Evergreen
AI-enhanced search tends to favor current, accurate content. Make sure your website:
- Includes current programs and services
- Features regularly updated blog content
- Removes outdated event pages or links
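One rough way to find pages that need attention is to scan your XML sitemap for old `lastmod` dates. The sketch below assumes a standard sitemap whose `lastmod` values start with a full `YYYY-MM-DD` date; the `stale_urls` helper, the 90-day threshold, and the sample URLs are hypothetical. Pages without a `lastmod` value are flagged too, since their age is unknown.

```python
from datetime import date, timedelta
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content for illustration.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/programs/</loc><lastmod>2024-05-20</lastmod></url>
  <url><loc>https://example.org/events/2022-gala/</loc><lastmod>2022-09-01</lastmod></url>
  <url><loc>https://example.org/about/</loc></url>
</urlset>
"""


def stale_urls(sitemap_xml: str, today: date, max_age_days: int = 90) -> list[str]:
    """Return URLs whose lastmod is missing or older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS).strip()
        # Assumes lastmod begins with YYYY-MM-DD; any time portion is ignored.
        if not lastmod or date.fromisoformat(lastmod[:10]) < cutoff:
            stale.append(loc)
    return stale


if __name__ == "__main__":
    for url in stale_urls(SAMPLE_SITEMAP, today=date(2024, 6, 1)):
        print("Review:", url)
```

A flagged page is not necessarily wrong; the list is simply a starting point for deciding what to refresh, redirect, or remove.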
Balancing Visibility with Privacy
Non-profits may struggle to balance visibility with discretion. Here are some guidelines:
Objective | Recommended Action |
---|---|
Maximize visibility | Allow all major search engines and AI bots to crawl public pages |
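Limit crawling of sensitive content | Use robots.txt to disallow specific directories or bots; keep truly private material behind a login, since robots.txt is advisory and publicly readable |
Avoid misuse of media | Use watermarks or noindex tags on private photos/videos |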
If you’re unsure which parts of your site should be crawled, consult with a web strategist or privacy expert.
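Once you have decided what to keep out of crawls, the "disallow specific directories or bots" approach from the table can be generated from a simple policy. This is a minimal sketch: the `build_robots_txt` helper and the directory names are hypothetical, and the output is only honored by crawlers that respect robots.txt.

```python
def build_robots_txt(rules: dict[str, list[str]]) -> str:
    """Build robots.txt text from a mapping of user-agent token to disallowed path prefixes."""
    blocks = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        # An empty Disallow value means nothing is disallowed for that agent.
        lines += [f"Disallow: {path}" for path in paths] or ["Disallow:"]
        blocks.append("\n".join(lines))
    # Separate each user-agent group with a blank line for readability.
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical policy: block AI-training tokens site-wide,
    # and keep all crawlers out of two sensitive directories.
    policy = {
        "GPTBot": ["/"],
        "Google-Extended": ["/"],
        "*": ["/client-stories/", "/internal/"],
    }
    print(build_robots_txt(policy), end="")
```

Keeping the policy in one place like this makes it easier to review with a privacy expert before the file goes live.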
FAQ: AI Search and Non-Profit Web Strategy
What is AI search and how does it differ from traditional search engines?
AI search uses large language models to summarize answers directly on the results page. Unlike traditional search that links out to websites, AI search may bypass clicks entirely.
Should my non-profit block AI crawlers?
Only if you’re concerned about content reuse or privacy. Blocking all crawlers can reduce visibility, so make selective decisions based on content type.
What kind of content performs best in AI search?
Concise, structured answers to common questions, backed by expertise and clear formatting, perform best. Also, structured data markup helps.
How often should I update my website?
Aim to update content at least quarterly, with fresh blog posts, testimonials, and event recaps.
What tools can I use to track SEO performance?
Use tools like Google Search Console, SEMrush, and Ahrefs to monitor keyword rankings and indexing.
Ready to Be Found by the Right People?
Higgens Media helps non-profits across Indiana increase visibility through:
- Search Engine Optimization (SEO)
- Video storytelling and donor outreach
- Website design and content strategy
Whether you want to protect sensitive content or expand your mission’s reach, we’ll guide you through the right steps.
Let’s talk about your story. Schedule a free consultation today.