Differs
Overview
A differ is applied to the filtered data if it has changed from the previous run(s). A differ summarizes the changes in the data and produces the content of the report sent to you. The output of the differ can be further filtered using any of the filters listed in Filters (see Filtering the diff below).
At the moment, the following differs are available:
unified: (default) Compares data line-by-line, showing changed lines in a “unified format”;
command: Executes an outside command that acts as a differ;
deepdiff: Compares structured data (JSON, YAML, or XML) element-by-element;
table: A Python version of the unified differ where the changes are displayed as an HTML table;
wdiff: Compares data word-by-word, highlighting changed words and maintaining line breaks.
In addition, the following BETA differs are available:
image: Highlights changes in an image (HTML reports only).
A differ is specified using the job directive differ. To select a differ with its default directive values, simply
use the name of the differ as the directive’s value:
url: https://example.net/unified.html
differ: unified # this entire line can be omitted as it's the default differ
url: https://example.net/deepdiff.html
differ: deepdiff # use the deepdiff differ with its default values
Otherwise, the differ directive is a dictionary, and the name key contains the name of the differ:
url: https://example.net/unified_no_range.html
differ:
  name: unified
  range_info: false
unified (default)
This is the default differ used when the differ job directive is not specified (except, for backward compatibility,
when in the configuration file the html report has the deprecated diff key set to table).
It does a line-by-line comparison of the data and reports lines that have been added (+), deleted (-), or changed. Changed lines are displayed twice: once marked as “deleted” (-) representing the old content, and once as “added” (+) representing the new content. Results are displayed in the unified format (the “unified diff”).
For HTML reports, webchanges colorizes the unified diff for easier legibility.
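To make the unified format concrete, here is a minimal sketch using Python's standard difflib module. It only illustrates the format described above; it is not webchanges' own code, and the sample data and the "old"/"new" labels are made up for the illustration.

```python
import difflib

# Made-up old and new data, one list item per line.
old = ["alpha", "beta", "gamma"]
new = ["alpha", "BETA", "gamma", "delta"]

# lineterm="" because the input lines carry no trailing newlines.
diff = list(difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm=""))
print("\n".join(diff))
# --- old
# +++ new
# @@ -1,3 +1,4 @@
#  alpha
# -beta
# +BETA
#  gamma
# +delta
```

The changed line appears twice (once with - and once with +), while the unchanged lines around it are context lines prefixed with a space.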
Examples
Using default settings:
url: https://example.net/unified.html
differ: unified # this can also be omitted as it's the default
Range information lines
Range information lines (those starting with @@) can be suppressed using range_info: false:
url: https://example.net/unified_no_range.html
differ:
  name: unified
  range_info: false
Context lines
The context_lines directive causes a unified diff to have a set number of context lines that might be different from
Python’s default of 3 (or 0 if the job contains additions_only: true or deletions_only: true).
Example using 5 context lines:
url: https://example.com/#lots_of_contextlines
differ:
  name: unified
  context_lines: 5
Optional directives
context_lines (int): The number of lines on each side surrounding changes to include in the report (default: 3).
range_info (true/false): Whether to include line range information lines (those starting with @@) (default: true).
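The effect of these two directives can be approximated with Python's difflib, as in the sketch below. The helper make_unified_diff and its parameter names are hypothetical and merely mirror the directive names; webchanges' internal implementation may differ.

```python
import difflib

def make_unified_diff(old, new, context_lines=3, range_info=True):
    """Hypothetical helper mirroring the context_lines and range_info directives."""
    lines = difflib.unified_diff(old, new, lineterm="", n=context_lines)
    if not range_info:
        # Drop the range information lines, i.e. those starting with @@.
        lines = (line for line in lines if not line.startswith("@@"))
    return list(lines)

# Twenty lines of made-up data with a single change on line 10.
old = [f"line {i}" for i in range(1, 21)]
new = old.copy()
new[9] = "line 10 (edited)"

default_diff = make_unified_diff(old, new)                   # 3 context lines per side
wide_diff = make_unified_diff(old, new, context_lines=5)     # 5 context lines per side
no_range = make_unified_diff(old, new, range_info=False)     # no @@ lines
```

With 3 context lines the diff holds 11 lines (2 header lines, 1 range line, 6 context lines and the changed line shown twice); with 5 context lines it grows to 15.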
Changed in version 3.21: Became a standalone differ. Added the range_info and context_lines directives, the latter replacing the
job directive contextlines (added in version 3.0).
ai_google
Added in version 3.21: as BETA.
Added in version 3.33.
Prefaces a unified diff with a textual summary of changes generated by any of Google’s Gemini Generative AI models called via an API call. This can be free of charge for most developers.
Gemini models are the first widely available models with a large context window (currently 1 million tokens), which allows them to analyze changes in long documents (of up to 350,000 words, or about 700 single-spaced pages) such as terms and conditions, privacy policies, etc. that other models can’t handle. For clarity, these models can handle up to approximately 700,000 words, but a comparison between two versions needs approximately half of this for the old text and the other half for the new text. They are also offered with a free tier.
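The arithmetic behind these figures can be written out as a short sketch. The ratio of 700 words per 1,000 tokens is only what the numbers above imply (1 million tokens for roughly 700,000 words); real tokenization varies with language and content, and the function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000
# Assumption implied by the text: 1 million tokens is roughly 700,000 words.
WORDS_PER_1000_TOKENS = 700

def max_total_words(context_tokens=CONTEXT_WINDOW_TOKENS):
    """Rough number of words that fit in the context window."""
    return context_tokens * WORDS_PER_1000_TOKENS // 1000

def max_words_per_version(context_tokens=CONTEXT_WINDOW_TOKENS):
    """Half of the window is for the old text, the other half for the new text."""
    return max_total_words(context_tokens) // 2

def fits_in_context(old_text, new_text):
    """Very rough check based on whitespace-separated word counts."""
    return len(old_text.split()) + len(new_text.split()) <= max_total_words()

print(max_words_per_version())  # 350000
```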
Important
Requires a system environment variable GEMINI_API_KEY containing the Google Cloud AI Studio
API Key, which you obtain here and which itself requires a Google
Cloud account.
Warning
Gemini offers free use for developers for feedback and testing only (no production use; your data is used to train their models). If your use qualifies, you must create a free of charge plan, which you obtain by creating an API key from a (separate) Google Cloud project with billing disabled. Otherwise we highly recommend that you set up a budget with threshold notification enabled to avoid the potential for surprises!
By default, we specify the Gemini 2.0 Flash
model (gemini-2.0-flash) since it’s the last released model that allows 1,000,000 tokens per minute on the free
tier, and (if you are on a paid plan) is cheaper than Gemini 2.5 or
Gemini 3. You can find the full list of models here. To evaluate
responses between models side-by-side, use the tool here.
Tip
If you can fit your input within the 250,000 tokens per minute rate limit of the free tier, we have been
having great success with the Gemini 2.5 Flash
(gemini-2.5-flash) and Gemini 3 Flash Preview
(gemini-3-flash-preview) models, which we highly recommend trying.
You can change the default model in the configuration file as follows:
differ_defaults:
  _note: Default directives that are applied to individual differs.
  unified: {}
  ai_google:
    model: gemini-2.5-pro
  command: {}
  deepdiff: {}
  image: {}
  table: {}
  wdiff: {}
Note
These models work with 38 languages and are available in over 220 countries and territories.
Warning
Generative AI can “hallucinate” (make things up), so always double-check the AI-generated summary with the accompanying unified diff.
The default prompt asks the Generative AI model to make the comparison (see below for default prompt). However, to save tokens and time (and potentially $), you might want the model to only summarize the differences from a unified diff by using a prompt similar to the one here:
differ:
  name: ai_google
  prompt: >-
    Describe the differences between the two versions of text as summarized in this unified diff.
    Only highlight the most significant modifications.\n\n{unified_diff}
More information about writing input prompts for these models can be found here. You may also use the “Help me write” function in Vertex AI or ask the model itself (in AI Studio) to suggest prompts that are appropriate to your use case.
Example
Using the default prompt, a summary is prefaced to a unified diff:
--- @ Thu, 01 Oct 2020 00:00:00 +0000
+++ @ Thu, 01 Oct 2020 01:00:00 +0000
@@ -1 +1 @@
-Sat Oct 1 00:00:00 UTC 2020
+Sat Oct 1 01:00:00 UTC 2020
Summary by Google Generative AI model gemini-2.0-flash
Tip
You can do “dry-runs” of this (or any) differ on an existing job by editing the differ in the job file and
running e.g. webchanges --test-differ 1 --test-reporter browser. Don’t forget to revert your job file if you
don’t like the new outcome!
Mandatory environment variable
GEMINI_API_KEY: Must contain your Google Cloud AI Studio API Key.
Optional directives
model (str): A model code (default: gemini-2.0-flash).
system_instructions (str): Optional tone and style instructions for the model (default: see below).
prompt (str): The prompt sent to the model; the strings {unified_diff}, {unified_diff_new}, {old_text} and {new_text} will be replaced by the respective content; any \n in the prompt will be replaced by a newline (default: see below).
timeout (float): The number of seconds before timing out the API call (default: 300).
Data to diff
additions_only (bool): Provide a summary of only the new text (i.e. the lines added per unified diff).
prompt_ud_context_lines (int): If {unified_diff} is present in the prompt, the number of context lines in the unified diff sent to the model (default: 999). If the resulting model prompt is estimated to be too big for the model to handle, the unified diff will be recalculated with the default number of context lines (3). Note that this unified diff is a different one than the diff included in the report itself.
Model tuning
temperature (float between 0.0 and 2.0): The model’s Temperature parameter, which controls randomness; higher values increase diversity (default: 0.0).
thinking_budget (int): Only for Gemini 2.5: The model’s thinking budget; see model documentation (default: unset, effect varies by model as per documentation).
thinking_level (‘low’, ‘medium’, or ‘high’): For Gemini 3 and above, the model’s thinking level; see model documentation (default: unset).
top_k (int of 1 or greater): The model’s TopK parameter, i.e. k most likely next tokens to sample from at each step. Lower k focuses on higher probability tokens (default: Google’s default, which is model-dependent, but typically 1; see model documentation).
top_p (float between 0.0 and 1.0): The model’s TopP parameter, or the cumulative probability cutoff for token selection. Lower p means sampling from a smaller, more top-weighted nucleus and reduces diversity (default: 1.0 if temperature is 0.0 (default), otherwise Google’s default, which is model-dependent, but typically 0.95 or 1.0; see model documentation).
tools (list): Data passed on to the API’s ‘tool’ field, which calls a piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of knowledge and scope of the model (see here).
Note
You can learn about Temperature, TopK and TopP parameters here and here. In general, temperature increases creativity and diversity in phrasing, while top-p and top-k influence the variety of individual words, with low values leading to potentially repetitive summaries. The only way to get these “right” is through experimentation with actual data, as the results are highly dependent on the input and subject to your personal preferences.
Underlying unified diff
unified(dict): Directives passed to unified differ, which prepares the unified diff attached to this report. Example:
command: date
differ:
  name: ai_google
  unified:
    context_lines: 5
    range_info: false
Default system instructions and prompts:
Special variables for prompt
When present in the prompt text, the following will be replaced:
{old_text}: Replaced with the old text.
{new_text}: Replaced with the new (currently retrieved) text.
{unified_diff}: Replaced with a unified diff, with 999 context lines unless changed by prompt_ud_context_lines (see above).
{unified_diff_new}: Replaced with the added lines from the unified diff, with the initial + stripped (i.e. roughly the new text).
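The substitution described above could be sketched as follows. This is an illustration only, not webchanges' actual code: the helpers fill_prompt and added_lines are hypothetical, and the order of operations (converting \n first, then replacing variables in a single pass) is an assumption.

```python
import re

def fill_prompt(prompt, values):
    """Hypothetical helper: expand \\n and the {…} special variables in a prompt."""
    # Convert the two characters backslash + n into a real newline first, so that
    # only the prompt template (not the substituted data) is affected.
    prompt = prompt.replace("\\n", "\n")
    # Single pass over the template; unknown {names} are left untouched.
    return re.sub(r"\{(\w+)\}", lambda m: values.get(m.group(1), m.group(0)), prompt)

def added_lines(unified_diff):
    """Added lines of a unified diff with the initial + stripped.

    Simplification: any line starting with +++ is treated as a header line.
    """
    return "\n".join(
        line[1:]
        for line in unified_diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    )

# Made-up sample data.
unified_diff = "--- old\n+++ new\n@@ -1 +1 @@\n-Fee: $10\n+Fee: $12"
values = {
    "old_text": "Fee: $10",
    "new_text": "Fee: $12",
    "unified_diff": unified_diff,
    "unified_diff_new": added_lines(unified_diff),
}
filled = fill_prompt(r"Summarize this diff.\n\n{unified_diff}", values)
print(filled)
```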
Default
System instructions
You are a skilled journalist tasked with analyzing two versions of a text and summarizing the key differences in meaning between them. The audience for your summary is already familiar with the text’s content, so you can focus on the most significant changes.
Instructions:
Carefully examine the old version of the text, provided within the <old_version> and </old_version> tags.
Carefully examine the new version of the text, provided within the <new_version> and </new_version> tags.
Compare the two versions, identifying areas where the meaning differs. This includes additions, removals, or alterations that change the intended message or interpretation.
Ignore changes that do not affect the overall meaning, even if the wording has been modified.
Summarize the identified differences, except those ignored, in a clear and concise manner, explaining how the meaning has shifted or evolved in the new version compared to the old version only when necessary. Be specific and provide examples to illustrate your points when needed.
If there are only additions to the text, then summarize the additions.
Use Markdown formatting to structure your summary effectively. Use headings, bullet points, and other Markdown elements as needed to enhance readability.
Restrict your analysis and summary to the information provided within the <old_version> and <new_version> tags. Do not introduce external information or assumptions.
Prompt
<old_version> {old_text} </old_version>
<new_version> {new_text} </new_version>
With additions_only
System instructions
You are a skilled journalist. Your task is to summarize the provided text in a clear and concise manner. Restrict your analysis and summary only to the text provided. Do not introduce any external information or assumptions.
Format your summary using Markdown. Use headings, bullet points, and other Markdown elements where appropriate to create a well-structured and easily readable summary.
Prompt
{unified_diff_new}
No changes are tracked here prior to v3.33, as the differ was in BETA; please refer to the changelog.
Changed in version 3.33: Removed the BETA tag.
Added thinking_level and media_resolution sub-directives.
command
Call an external differ. The old data and the new data are each written to a temporary file, and the names of the two files are appended to the command. The external program must exit with a status of 0 if no differences are found, a status of 1 if any differences are found, or any other status to signify an error (mimicking wdiff’s behavior).
If your differ outputs HTML, you should set is_html to true.
If wdiff is called, its output will be colorized when displayed on stdout (typically a screen) and for HTML reports.
However, we strongly recommend you use the built-in wdiff differ instead!
Tip
Use the job directive monospace if you want to use a monospace font in the report.
Example
url: https://example.net/command.html
differ:
  name: command
  command: python mycustomscript.py
  is_html: true # if the custom differ outputs HTML
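As a rough sketch, a custom script such as the mycustomscript.py named in the example might look like the following. Only the calling convention (two file names appended to the command; exit status 0, 1 or other) comes from the description above. The use of difflib.ndiff, printing the result to standard output, and reading the files as UTF-8 are assumptions made for the illustration. This version prints plain text, so it would be used without is_html: true.

```python
import difflib
import sys
from pathlib import Path

def diff_files(old_path, new_path):
    """Return the added and deleted lines between two text files."""
    # Assumption: the temporary files are UTF-8 encoded text.
    old = Path(old_path).read_text(encoding="utf-8").splitlines()
    new = Path(new_path).read_text(encoding="utf-8").splitlines()
    return [line for line in difflib.ndiff(old, new) if line.startswith(("+ ", "- "))]

def main(argv):
    """Exit status: 0 = no differences, 1 = differences found, 2 = error."""
    if len(argv) != 3:
        print("usage: mycustomscript.py OLD_FILE NEW_FILE", file=sys.stderr)
        return 2
    try:
        changes = diff_files(argv[1], argv[2])
    except OSError as error:
        print(error, file=sys.stderr)
        return 2
    if changes:
        print("\n".join(changes))
        return 1
    return 0

# Run only when called with arguments (the two file names are appended to the command).
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```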
Note
See this note for the file security settings required to run jobs with this differ in Linux.
Changed in version 3.21: Was previously a job sub-directive by the name of diff_tool.
Changed in version 3.29: Added is_html sub-directive.
Changed in version 3.33: Added context_lines sub-directive.
Required directives
command: The command to execute.
Optional directives
is_html (true/false): Whether the output of the command is HTML, for correct formatting in reports (default: false).
deepdiff
Added in version 3.21.
Inspects structured data (JSON, YAML, or XML) on an element-by-element basis and reports which elements have changed, using a customized report based on the DeepDiff module of the deepdiff library.
Examples
url: https://example.net/deepdiff_json.html
differ: deepdiff
url: https://example.net/deepdiff_xml_ignore_oder.html
differ:
  name: deepdiff
  data_type: xml # override deriving it from the data's media type (fka MIME type)
  ignore_order: true
With compact
url: https://example.net/deepdiff_xml_ignore_oder.html
differ:
  name: deepdiff
  compact: true # more compact YAML-style report, does not report type changes
Optional directives
data_type (json, yaml, or xml): The type of data being analyzed if different than the data’s media type (fka MIME type), defaulting to json if unable to derive.
ignore_order (true/false): Whether to ignore the order in which the items have appeared (default: false).
ignore_string_case (true/false): Whether to ignore case when comparing strings (default: false).
significant_digits (int): The number of digits AFTER the decimal point to be used in the comparison (default: no limit).
compact (true/false): Produce a more compact YAML-style report which also ignores type changes (e.g. “type changed from NoneType to str”).
Note
When you set ignore_order: true, DeepDiff treats lists as if they were sets. To compare two
sets it needs to pair up the items, and its default strategy is to hash the objects in the
list. Dictionaries, however, are not hashable in Python. When DeepDiff finds a dictionary in
new_data that has even a tiny difference from its counterpart in old_data, it can’t be sure
they are “the same object, but modified”, so it reports that the entire old dictionary is gone
and that the entire new dictionary has been added. The report will therefore show a change for
the entire, and potentially large, dictionary, not just for the changed nested value(s).
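The behavior described in this note can be illustrated with a small standard-library sketch. It does not use DeepDiff and does not reproduce its algorithm; the helper unordered_changes is hypothetical and pairs items by a canonical JSON dump simply to show why a slightly modified dictionary shows up as one whole removal plus one whole addition.

```python
import json

# Dictionaries cannot be hashed directly, so they cannot be members of a set.
try:
    hash({"id": 1})
    dict_is_hashable = True
except TypeError:
    dict_is_hashable = False

def unordered_changes(old_items, new_items):
    """Hypothetical helper: compare two lists as sets, keyed by canonical JSON."""
    old_by_key = {json.dumps(item, sort_keys=True): item for item in old_items}
    new_by_key = {json.dumps(item, sort_keys=True): item for item in new_items}
    removed = [old_by_key[key] for key in old_by_key.keys() - new_by_key.keys()]
    added = [new_by_key[key] for key in new_by_key.keys() - old_by_key.keys()]
    return removed, added

# Made-up data: the order changes and one nested value (price) is modified.
old_data = [{"id": 1, "price": 10}, {"id": 2, "price": 20}]
new_data = [{"id": 2, "price": 20}, {"id": 1, "price": 11}]

removed, added = unordered_changes(old_data, new_data)
print(removed)  # [{'id': 1, 'price': 10}]
print(added)    # [{'id': 1, 'price': 11}]
```

The reordering is ignored, but the single changed price is reported as the whole old dictionary removed and the whole new dictionary added.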
Required packages
To run jobs with this differ, you need to first install additional Python packages as follows:
uv pip install --upgrade webchanges[deepdiff]
Changed in version 3.30: Added support for YAML data.
Changed in version 3.30.1: Added compact sub-directive.
image
Added in version 3.21: As BETA.
Note
This differ is currently in BETA, mostly because it’s unclear what more needs to be developed, changed or parametrized in order to make the differ work with the vast variety of images. Feedback welcomed here.
Highlights changes in an image by overlaying them in yellow on a greyscale version of the original image. Only works with HTML reports.
Examples
Monitor a URL of an image directly, and see if the image changes:
url: https://sources.example.net/productimage.jpg
filters:
  - ascii85
differ:
  name: image
  data_type: ascii85
Extract an image URL from an HTML <img> tag and monitor if this URL changes:
url: https://www.example.net/productpage.html
filters:
  - xpath: //div[@class="image"]/img/@src
differ:
  name: image
  data_type: url
Optional directives
This differ is currently in BETA and these directives may change in the future.
data_type (url, filename, ascii85 or base64): The type of data to process: a link to the image, the path to the file containing the image, or the image itself encoded as Ascii85 or RFC 4648 Base64 text (default: url).
mse_threshold (float): The minimum mean squared error (MSE) between two images to consider them changed; requires the package numpy to be installed (default: 2.5).
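For readers unfamiliar with the mean squared error, the sketch below computes it in pure Python on two tiny, made-up greyscale "images" given as flat lists of pixel values. The differ itself relies on numpy, and how it prepares the images and whether the comparison with the threshold is strict are not stated here, so treat both as assumptions.

```python
MSE_THRESHOLD = 2.5  # the default value of mse_threshold

def mean_squared_error(pixels_a, pixels_b):
    """Average of the squared per-pixel differences of two equally sized images."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(pixels_a, pixels_b)) / len(pixels_a)

old_pixels = [10, 10, 10, 10]
slightly_different = [10, 11, 10, 10]   # squared differences: 0, 1, 0, 0
clearly_different = [10, 12, 10, 14]    # squared differences: 0, 4, 0, 16

small_mse = mean_squared_error(old_pixels, slightly_different)  # 1 / 4 = 0.25
large_mse = mean_squared_error(old_pixels, clearly_different)   # 20 / 4 = 5.0

# Assumption: an image counts as changed when the MSE exceeds the threshold.
print(small_mse > MSE_THRESHOLD)  # False
print(large_mse > MSE_THRESHOLD)  # True
```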
Note
If you pass a url or filename to the differ, it compares images only when the url or filename itself
changes: a change in the image behind an unchanged url/filename goes undetected, and no change is reported if the
url or filename changes but the image doesn’t. To detect changes in an image when the url or filename doesn’t
change, build a job that captures the image itself encoded in Ascii85 (preferably, see the ascii85 filter) or
Base64 and set data_type accordingly.
Required packages
To run jobs with this differ, you need to first install additional Python packages as follows:
uv pip install --upgrade webchanges[imagediff]
In addition, you can only run it with a default configuration of webchanges, which installs the
httpx HTTP client library; requests is not supported.
table
Similar to unified, it performs a line-by-line comparison and reports lines that have been added, deleted, or changed. However, this is reported in an HTML table format showing a side by side, line by line comparison of text with inter-line and intra-line change highlights produced by Python’s difflib.HtmlDiff class.
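For reference, the difflib.HtmlDiff class mentioned above can be tried directly, as in this minimal sketch. The sample data and the column labels are made up, and the exact arguments webchanges passes to the class are not documented here, so the ones below are assumptions.

```python
import difflib

# Made-up old and new data, one list item per line.
old = ["alpha", "beta", "gamma"]
new = ["alpha", "BETA", "gamma"]

# tabsize corresponds to the tabsize directive (default: 8).
html_diff = difflib.HtmlDiff(tabsize=8)

# make_table() returns only the <table> element, without a surrounding HTML page.
table_html = html_diff.make_table(old, new, fromdesc="old", todesc="new")
print(table_html[:60])
```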
Example
url: https://example.net/table.html
differ: table
For backwards compatibility, this is the default differ for an html reporter with the configuration setting
diff (deprecated) set to table.
Optional directives
tabsize(int): Tab stop spacing (default: 8).
Changed in version 3.21: Became a standalone differ (previously only accessible through configuration file settings).
Added the tabsize directive.
wdiff
Added in version 3.24.
Performs a word-by-word comparison highlighting words that have been added (added) or deleted (deleted). Changed words are displayed twice: once marked as “deleted” (deleted) representing the old word(s), and the new word(s) as “added” (added). Line breaks are maintained.
It is similar to GNU’s Wdiff, but requires no external dependency.
When unchanged lines are skipped, they are reported using @@. For example, @@ 1...22 @@ means that lines 1 to
22 are skipped from the report as they are unchanged.
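The idea of a word-by-word comparison can be sketched with Python's difflib.SequenceMatcher applied to lists of words instead of lines. This is not the differ's actual implementation; the function word_diff is hypothetical, and the [-…-] and {+…+} markers are borrowed from GNU Wdiff's plain-text output purely for illustration (the built-in differ highlights words instead).

```python
import difflib

def word_diff(old_line, new_line):
    """Hypothetical helper: mark deleted words as [-…-] and added words as {+…+}."""
    old_words, new_words = old_line.split(), new_line.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)

result = word_diff(
    "The time now is 00:00:00.00 UTC",
    "The time now is 01:00:00.00 UTC",
)
print(result)  # The time now is [-00:00:00.00-] {+01:00:00.00+} UTC
```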
Example
command: echo The time now is %time% UTC # Windows
differ: wdiff
Output:
+++ @ Thu, 01 Oct 2020 01:00:00 +0000
The time now is 00:00:00.00 01:00:00.00 UTC
Checked 1 source in 0.1 seconds with webchanges.
Optional directives
context_lines (int): The number of context lines on each side of changes to provide surrounding content to better understand the changes (default: 3).
range_info (true/false): Include range information lines for unreported lines (default: true).