Open Source & Attributions
The open-source software, open data, and third-party services that make save.ag possible
Built on Open Source
Save.ag is built almost entirely with open-source software. We believe open knowledge should be built on open tools — the communities that create and maintain these projects make our work possible, and we want to acknowledge that clearly.
This page lists every significant open-source component we use, the license under which it is provided, every open data source we draw from, and every third-party service the platform relies on. A machine-readable Software Bill of Materials (SBOM) is also available for our production dependencies.
Core Infrastructure
The foundation of Save.ag runs on these open-source systems:
| Software | Role | License |
|---|---|---|
| Python | Primary programming language | PSF License |
| PostgreSQL | Database | PostgreSQL License |
| pgvector | Vector similarity search for RAG retrieval | PostgreSQL License |
| PostGIS | Geospatial queries for climate zone mapping | GPL-2.0 |
| Linux | Server operating system | GPL-2.0 |
Python Libraries
These libraries power the web application, content processing, and data pipelines:
Web Framework & Server
| Package | Role | License |
|---|---|---|
| Flask | Web application framework | BSD-3-Clause |
| Jinja2 | HTML template engine | BSD-3-Clause |
| Werkzeug | WSGI toolkit | BSD-3-Clause |
| Gunicorn | Production WSGI server | MIT |
| Flask-Compress | Response compression | MIT |
Data & Content Processing
| Package | Role | License |
|---|---|---|
| psycopg2 | PostgreSQL database adapter | LGPL-3.0 |
| Requests | HTTP client | Apache-2.0 |
| Beautiful Soup | HTML parsing | MIT |
| Bleach | HTML sanitization | Apache-2.0 |
| Markdown | Markdown processing | BSD-3-Clause |
| Pillow | Image processing | HPND (Historical Permission Notice) |
| PyYAML | Configuration file parsing | MIT |
Geospatial & Security
| Package | Role | License |
|---|---|---|
| Rasterio | USDA climate zone raster lookup | BSD-3-Clause |
| GDAL | Geospatial data processing | MIT/X11 |
| NumPy | Numerical computation for embeddings and geospatial processing | BSD-3-Clause |
| bcrypt | Password hashing | Apache-2.0 |
Content Pipelines
Our backend content pipelines (transcript processing, abstract harvesting, web article extraction) use additional open-source libraries:
| Package | Role | License |
|---|---|---|
| Trafilatura | Web content extraction | Apache-2.0 |
| youtube-transcript-api | YouTube transcript retrieval | MIT |
| Pydantic | Data validation | MIT |
| lxml | XML/HTML parsing | BSD-3-Clause |
| httpx | Async HTTP client | BSD-3-Clause |
Transitive Dependencies
These libraries are pulled in automatically as dependencies of the packages listed above. Notable entries:
| Package | Pulled in by | License |
|---|---|---|
| click | Flask | BSD-3-Clause |
| MarkupSafe | Jinja2 | BSD-3-Clause |
| itsdangerous | Flask | BSD-3-Clause |
| blinker | Flask | MIT |
| urllib3 | Requests | MIT |
| certifi | Requests | MPL-2.0 |
| charset-normalizer | Requests | MIT |
| idna | Requests | BSD-3-Clause |
| Brotli | Flask-Compress | MIT |
| zstandard | Flask-Compress | BSD-3-Clause |
The full list of 80+ production dependencies (with exact versions and license identifiers) is available in our downloadable SBOM. Note: certifi uses the MPL-2.0 license (file-level weak copyleft). Since we use it unmodified as a dependency, no additional obligations apply.
Frontend
Save.ag uses no JavaScript frameworks, CSS frameworks, or third-party UI libraries. The entire frontend is built with vanilla HTML, CSS, and JavaScript — server-rendered with Jinja2 templates.
| Resource | Role | License |
|---|---|---|
| Poppins | Primary typeface (self-hosted) | SIL Open Font License 1.1 |
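Poppins is self-hosted, so no external font services are contacted. Apart from the Cloudflare Turnstile challenge on protected forms (described under Third-Party Services), no third-party scripts, stylesheets, or CDN resources are used on client-facing pages.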
Open Data Sources
Beyond software, Save.ag incorporates data from several open and public sources. We are grateful to the communities and institutions that make this data freely available.
Geospatial & Climate Data
| Source | Data Used | License / Terms |
|---|---|---|
| USDA Agricultural Research Service | Plant Hardiness Zone Map 2023 (GeoTIFF rasters) | Public domain (U.S. government work, USDA-ARS) |
| OpenStreetMap (via Nominatim) | Geocoding for climate zone lookup | Open Database License (ODbL) |
| Open-Meteo | Historical temperature normals for zone calculation | CC BY 4.0 |
| Köppen-Geiger Climate Classification | Global climate zone raster | CC BY 4.0 (Beck et al., 2018) |
Research & Agricultural Data
| Source | Data Used | License / Terms |
|---|---|---|
| PubMed / NCBI | Academic abstract metadata for research citations | Public domain (U.S. government service); individual abstracts may be copyrighted by publishers |
| YouTube (via Data API) | Practitioner video transcripts, processed into topical segments | YouTube API Terms of Service; original content belongs to creators |
All content derived from these sources is cited with attribution. Academic abstracts are framed epistemically (“This study found”) and never presented as our own conclusions. Practitioner content is attributed to its original creator with links to the source material.
Third-Party Services
Save.ag relies on several third-party services for hosting, AI processing, analytics, and operations. These are not open-source software, but transparency about them is part of how the platform operates. For details on what data is exchanged with each service, see our Privacy Policy.
AI & Machine Learning
OpenRouter
OpenRouter is the AI routing layer that proxies all LLM API calls to the underlying model providers. Every stage of an Ask query (query analysis, response synthesis, and embedding generation) flows through OpenRouter before reaching the appropriate model. OpenRouter operates under no-training agreements with its underlying providers, meaning user queries are not used to train models. Terms of Service.
Google Gemini 2.5 Flash Lite (via OpenRouter)
Google Gemini 2.5 Flash Lite handles query analysis: classifying user intent and extracting entities such as practices, species, and locations from query text. It is accessed via OpenRouter under a no-training agreement, so queries are not used to train models. Google AI Terms.
Google Gemini 3.1 Flash Lite (via OpenRouter)
Google Gemini 3.1 Flash Lite synthesizes the final answer shown to the user, drawing from retrieved source excerpts. It receives query text and retrieved source content during synthesis. It is accessed via OpenRouter under a no-training agreement, so queries are not used to train models. Google AI Terms.
OpenAI text-embedding-3-small (via OpenRouter)
OpenAI’s text-embedding-3-small model converts user queries into 1536-dimensional vector embeddings for semantic similarity search against the RAG content corpus (~230,000 clusters). The query text is sent for embedding generation; the original text is not retained by OpenAI beyond the API call. The model is accessed via OpenRouter under a no-training agreement. OpenAI API Terms.
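To illustrate what a semantic similarity search does with a query embedding, here is a minimal sketch in plain Python. It assumes cosine similarity as the ranking metric (pgvector supports several distance operators, and this page does not say which one save.ag uses). The `cosine_similarity` and `top_k` helpers, the cluster names, and the toy 3-dimensional vectors are hypothetical stand-ins for real 1536-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k=2):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors standing in for 1536-dimensional embeddings.
corpus = {
    "cover-crops": [0.9, 0.1, 0.0],
    "drip-irrigation": [0.0, 1.0, 0.2],
    "composting": [0.7, 0.3, 0.1],
}
query = [1.0, 0.0, 0.0]
results = top_k(query, corpus)
```

In a production setup the ranking would normally happen inside the database through a vector index rather than in application code; the sketch only shows the arithmetic behind the ranking.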
Hosting & Infrastructure
Fly.io
Fly.io hosts the save.ag production application (Flask web server), the self-hosted Umami analytics instance, and the managed PostgreSQL database. Compute, networking, and blue-green deployments run in the sjc (San Jose, CA) region, so all application traffic and logs pass through Fly.io infrastructure. Terms of Service.
Fly Managed PostgreSQL
Fly Managed PostgreSQL is the primary database for save.ag, storing all site content, Learn pages, user accounts, query logs, and analytics. It runs on Fly.io infrastructure with PgBouncer connection pooling. Extensions in use include pgvector (vector similarity search) and PostGIS (geospatial queries). The PostgreSQL software itself is open-source under the PostgreSQL License; Fly.io’s Terms of Service govern the managed hosting layer.
Cloudflare (CDN & DNS)
Cloudflare provides DNS and CDN services for save.ag, with Full (strict) SSL/TLS termination. All traffic to save.ag resolves through Cloudflare before reaching Fly.io, providing DDoS protection and CDN caching. As a reverse proxy, Cloudflare processes IP addresses and HTTP request metadata for security and routing purposes. Terms of Service.
Cloudflare Turnstile
Cloudflare Turnstile is a privacy-preserving CAPTCHA protecting the contact form, source submission form, and partner tools endpoints against automated spam. Turnstile performs a browser-side challenge using passive browser signals. It does not set cookies or track users across sites. The resulting verification token is sent to Cloudflare’s servers to confirm human interaction. Cloudflare Privacy Policy.
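The token verification step can be sketched as follows. This is an illustration under stated assumptions, not save.ag's actual handler: it builds the server-side request to Turnstile's documented `siteverify` endpoint without sending it. The helper names `build_siteverify_request` and `is_human`, and the secret and token values, are hypothetical.

```python
import json
import urllib.parse
import urllib.request

SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify"


def build_siteverify_request(secret_key, token):
    """Build (but do not send) the POST request that validates a Turnstile token."""
    body = urllib.parse.urlencode({"secret": secret_key, "response": token})
    return urllib.request.Request(SITEVERIFY_URL, data=body.encode("ascii"), method="POST")


def is_human(response_body):
    """Interpret the JSON body returned by siteverify; only success == true passes."""
    return json.loads(response_body).get("success") is True


# Hypothetical values: the secret comes from server configuration,
# the token from the submitted form.
request = build_siteverify_request("hypothetical-secret", "token-from-form")
```

Sending the request (for example with `urllib.request.urlopen(request)`) and passing the response body to `is_human` would complete the check; the form submission should be rejected whenever the result is not `True`.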
Analytics & Communications
Umami (Self-Hosted, MIT License)
Umami is a privacy-focused analytics platform, self-hosted on save.ag’s Fly.io infrastructure. It tracks aggregate page visits, event counts (Ask queries submitted, Learn pages viewed, sections expanded, citations clicked), and referrers, without personal identifiers, cookies, or IP address storage. Because Umami is self-hosted, no analytics data is sent to third-party servers. The browser’s Do Not Track (DNT) setting is respected. The software is open-source under the MIT License at github.com/umami-software/umami.
Gmail SMTP (Google Workspace)
Gmail SMTP routes inbound contact form messages and source submission forms to the site operator. This is not transactional email to users; it is the delivery mechanism for form submissions (name, email address, message text) sent to save.ag’s operator inboxes. Contact form data passes through Google’s SMTP infrastructure in transit. Google Workspace Terms.
Resend
Resend delivers transactional emails to registered users: account verification emails and password reset links. User email addresses and the content of these transactional messages pass through Resend’s infrastructure in transit. Resend Privacy Policy.
Data & Indexing
OpenStreetMap / Nominatim
Nominatim geocodes location text from user queries (e.g., “farms near Fresno, CA”) to latitude/longitude coordinates for climate zone lookup. Location text strings and IP addresses are sent to the Nominatim API as part of the geocoding request. Results are cached locally to minimize API calls. Nominatim is powered by OpenStreetMap data; see the Open Data Sources section for licensing details. Nominatim Usage Policy.
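The cache-then-geocode pattern described above might look roughly like this. It is a sketch under assumptions: the URL targets the public Nominatim `search` endpoint with its `q`, `format`, and `limit` parameters, while the `GeocodeCache` class, the key normalization, and the injected `fetch` callable are hypothetical and not taken from save.ag's code.

```python
import urllib.parse

NOMINATIM_SEARCH_URL = "https://nominatim.openstreetmap.org/search"


def build_geocode_url(location_text):
    """Build a Nominatim search URL asking for the single best match as JSON."""
    params = {"q": location_text, "format": "jsonv2", "limit": 1}
    return NOMINATIM_SEARCH_URL + "?" + urllib.parse.urlencode(params)


class GeocodeCache:
    """In-memory cache keyed on normalized location text (hypothetical helper)."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable taking a URL and returning (lat, lon)
        self._store = {}

    def lookup(self, location_text):
        # Normalize case and whitespace so trivially different inputs share an entry.
        key = " ".join(location_text.lower().split())
        if key not in self._store:
            self._store[key] = self._fetch(build_geocode_url(key))
        return self._store[key]


# A fake fetcher records the URLs it is asked for, so no network access is needed.
requested_urls = []


def fake_fetch(url):
    requested_urls.append(url)
    return (36.7, -119.8)  # illustrative coordinates only


cache = GeocodeCache(fake_fetch)
first = cache.lookup("Fresno, CA")
second = cache.lookup("  fresno,  ca ")
```

Injecting the fetcher keeps the cache logic testable offline; a real fetcher would also need to send an identifying User-Agent and respect the rate limits in the Nominatim Usage Policy.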
Open-Meteo
Open-Meteo provides 30-year historical temperature normals for USDA hardiness zone calculation, used when a location coordinate is not covered by the primary USDA raster. Geographic coordinates derived from user location queries are sent to Open-Meteo’s API; the IP address is included in the HTTP request. Results are cached in a local database with a 90-day TTL. See the Open Data Sources section for Open-Meteo data licensing (CC BY 4.0). Open-Meteo Terms.
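As a rough illustration of the fallback calculation, the sketch below maps an average annual extreme minimum temperature to a USDA hardiness zone using the standard banding (10 °F per zone, 5 °F per half-zone, with zone 1a starting at -60 °F) and checks a cache entry against the 90-day TTL. How save.ag derives the minimum temperature from Open-Meteo normals is not described on this page, so the input value, the function names, and the assumption that temperatures arrive in Celsius are all hypothetical.

```python
import math
import time

CACHE_TTL_SECONDS = 90 * 24 * 60 * 60  # the 90-day TTL mentioned above


def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


def hardiness_zone(avg_annual_min_f):
    """Map an average annual extreme minimum (in °F) to a zone label such as '9b'."""
    # Half-zone index: 0 is zone 1a (-60 to -55 °F), 25 is zone 13b (65 to 70 °F).
    index = math.floor((avg_annual_min_f + 60) / 5)
    index = max(0, min(index, 25))  # clamp values outside the published range
    zone_number = index // 2 + 1
    half = "a" if index % 2 == 0 else "b"
    return f"{zone_number}{half}"


def is_fresh(cached_at, now=None):
    """True while a cached result is younger than the TTL (timestamps in seconds)."""
    now = time.time() if now is None else now
    return (now - cached_at) < CACHE_TTL_SECONDS


zone = hardiness_zone(celsius_to_fahrenheit(-3.0))  # -3 °C is 26.6 °F
```

A minimum of 26.6 °F falls in the 25 to 30 °F band, which is zone 9b. Clamping keeps out-of-range inputs from producing labels outside 1a through 13b.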
CrossRef API
CrossRef provides academic metadata for DOI lookups and abstract retrieval in the content pipeline. Save.ag uses CrossRef’s polite pool, which offers higher rate limits in exchange for including an operator contact email in the User-Agent header. No user data is sent to CrossRef; this is an operator-to-API relationship used in backend content ingestion, not in response to user queries. CrossRef Terms of Use.
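A polite-pool request can be sketched like this. The example only constructs the request for CrossRef's public `works` endpoint and does not send it; the application name, contact address, and DOI are hypothetical placeholders, and `build_crossref_request` is not a function from save.ag's pipeline.

```python
import urllib.parse
import urllib.request

CROSSREF_WORKS_URL = "https://api.crossref.org/works/"


def build_crossref_request(doi, contact_email, app_name="example-ingest/0.1"):
    """Build a DOI metadata request whose User-Agent carries a contact email."""
    url = CROSSREF_WORKS_URL + urllib.parse.quote(doi, safe="/")
    # The mailto: address in the User-Agent is what identifies the caller
    # as a polite-pool client.
    user_agent = f"{app_name} (mailto:{contact_email})"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical DOI and contact address, for illustration only.
request = build_crossref_request("10.1000/example-doi", "ops@example.org")
```

Because the contact email belongs to the operator and the DOI comes from the ingestion queue, nothing in this request originates from a site visitor.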
IndexNow (Bing Webmaster Tools)
IndexNow is an open protocol that notifies Bing, Yandex, and other participating search engines when pages are updated. After each production deployment, save.ag submits recently updated page URLs to the IndexNow API. No user data is involved; this is a deployment-time operator action. IndexNow FAQ.
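A deployment-time submission might be assembled along these lines. The JSON fields (`host`, `key`, `keyLocation`, `urlList`) follow the IndexNow protocol, but the host, key, and page URLs shown are hypothetical, and the sketch stops short of sending the request.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but do not send) a bulk IndexNow submission for the given URLs."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file is served from the site root as <key>.txt.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Hypothetical host, key, and page URLs.
request = build_indexnow_request(
    "example.org",
    "0123456789abcdef",
    ["https://example.org/learn/cover-crops", "https://example.org/learn/composting"],
)
```

The payload contains only public page URLs and the site's own verification key, which matches the statement that no user data is involved.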
License Summary
The open-source licenses used across our stack fall into these categories:
| License Type | Category | Obligation |
|---|---|---|
| MIT | Permissive | Include license text |
| BSD-3-Clause | Permissive | Include license text; no endorsement claims |
| Apache-2.0 | Permissive | Include license and NOTICE; patent grant included |
| PSF License | Permissive | Include license text |
| MPL-2.0 | Weak copyleft (file-level) | Modifications to individual source files must be shared; using the package unmodified as a dependency (as we do) carries no additional obligation |
| LGPL-3.0 | Weak copyleft | Modifications to the library itself must be shared; using it as a dependency (as we do) carries no additional obligation |
| GPL-2.0 | Copyleft | Applies to PostGIS and Linux, used as server-side infrastructure; does not apply to application code that queries the database |
| SIL OFL 1.1 | Permissive (fonts) | Free to use; fonts cannot be sold alone |
All licenses in our stack are compatible with commercial use. We do not modify or redistribute any copyleft-licensed source code.
Software Bill of Materials
We publish a machine-readable Software Bill of Materials (SBOM) for our production deployment in CycloneDX format. The SBOM lists every Python package deployed in production, including exact versions and package identifiers.
The SBOM is maintained in our project repository and updated with each production deployment.
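For readers who want to inspect an SBOM programmatically, here is a minimal sketch that reads a CycloneDX JSON document and lists each component with its license identifiers. The embedded sample is a hand-written fragment in the CycloneDX JSON layout; the package versions in it are illustrative placeholders, not save.ag's pinned versions, and `list_components` is a hypothetical helper.

```python
import json

# Hand-written sample in CycloneDX JSON layout; versions are illustrative only.
SAMPLE_SBOM = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "components": [
    {"type": "library", "name": "flask", "version": "3.0.0",
     "purl": "pkg:pypi/flask@3.0.0",
     "licenses": [{"license": {"id": "BSD-3-Clause"}}]},
    {"type": "library", "name": "certifi", "version": "2024.2.2",
     "purl": "pkg:pypi/certifi@2024.2.2",
     "licenses": [{"license": {"id": "MPL-2.0"}}]}
  ]
}
"""


def list_components(sbom_text):
    """Return (name, version, [license ids]) for every component in the SBOM."""
    bom = json.loads(sbom_text)
    if bom.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")
    rows = []
    for component in bom.get("components", []):
        # Entries that use an SPDX "expression" instead of "license" are skipped here.
        license_ids = [
            entry["license"].get("id") or entry["license"].get("name", "unknown")
            for entry in component.get("licenses", [])
            if "license" in entry
        ]
        rows.append((component["name"], component.get("version", ""), license_ids))
    return rows


components = list_components(SAMPLE_SBOM)
copyleft = [name for name, _, ids in components if "MPL-2.0" in ids]
```

The same loop can be extended to flag any license identifier outside the categories in the License Summary table.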
Our Commitment
Save.ag, a California Public Benefit Corporation, is built to serve the public interest. We use open-source software because we believe the tools for building agricultural knowledge should be as open as the knowledge itself. We are committed to:
- Transparency: This page documents every significant dependency and third-party service. We keep it current as our stack evolves.
- Attribution: Every data source is cited where it appears on the site, and acknowledged here as well.
- Compliance: We respect the terms of every license and data agreement under which we operate.
- Giving back: Where we develop tools or methods that could benefit others, we open-source them.
Questions
If you have questions about our use of open-source software, data licensing, or attribution practices, please contact us at [email protected].
Last Updated
This page was last updated on May 5, 2026.