Command line arguments

usage: webchanges [-h] [-V] [-v] [--jobs FILE] [--config FILE] [--hooks FILE]
                  [--database FILE] [--list-jobs] [--errors [REPORTER]]
                  [--test [JOB]] [--no-headless] [--test-differ JOB [JOB ...]]
                  [--dump-history JOB] [--max-workers WORKERS]
                  [--test-reporter REPORTER] [--smtp-login] [--telegram-chats]
                  [--xmpp-login] [--footnote FOOTNOTE] [--edit]
                  [--edit-config] [--edit-hooks]
                  [--gc-database [RETAIN_LIMIT]]
                  [--clean-database [RETAIN_LIMIT]]
                  [--rollback-database TIMESTAMP] [--delete-snapshot JOB]
                  [--change-location JOB NEW_LOCATION] [--check-new]
                  [--install-chrome] [--features] [--detailed-versions]
                  [--database-engine DATABASE_ENGINE]
                  [--max-snapshots NUM_SNAPSHOTS] [--add JOB] [--delete JOB]
                  [JOB(S) ...]

Checks web content, including images, to detect any changes since the prior
run. If any are found, it summarizes (including with Gen AI) what changed
('diff') and displays it and/or sends it via email and/or other supported
services. Can check the output of local commands as well.

positional arguments:
  JOB(S)                JOB(S) to run (if one, index as per --list or
                        URL/command, if multiple, by index) (default: run all
                        jobs)

options:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         show logging output; use -vv for maximum verbosity

override file defaults:
  --jobs FILE, --urls FILE
                        read job list (URLs/commands) from FILE or files
                        matching a glob pattern
  --config FILE         read configuration from FILE
  --hooks FILE          use FILE as hooks.py module to import
  --database FILE, --cache FILE
                        use FILE as snapshots database; FILE can be a redis
                        URI

job management:
  --list-jobs           list jobs and their index number
  --errors [REPORTER]   test run all jobs and list those with errors or no
                        data captured; optionally send output to REPORTER
  --test [JOB], --test-filter [JOB]
                        test a JOB (by index or URL/command) and show filtered
                        output; if no JOB, check syntax of config and jobs
                        file(s)
  --no-headless         turn off browser headless mode (for jobs using a
                        browser)
  --test-differ JOB [JOB ...], --test-diff JOB [JOB ...], --test-diff-filter JOB [JOB ...]
                        show diff(s) using existing saved snapshots of a JOB
                        (by index or URL/command)
  --dump-history JOB    print all saved changed snapshots for a JOB (by index
                        or URL/command)
  --max-workers WORKERS
                        maximum number of parallel threads (WORKERS)

reporters:
  --test-reporter REPORTER
                        test the REPORTER or redirect output of --test-differ
  --smtp-login          verify SMTP login credentials with server (and enter
                        or check password if using keyring)
  --telegram-chats      list telegram chats webchanges is joined to
  --xmpp-login          enter or check password for XMPP (stored in keyring)
  --footnote FOOTNOTE   FOOTNOTE text (quoted text)

launch editor ($EDITOR/$VISUAL):
  --edit                edit job (URL/command) list
  --edit-config         edit configuration file
  --edit-hooks          edit hooks script

database:
  --gc-database [RETAIN_LIMIT], --gc-cache [RETAIN_LIMIT]
                        garbage collect the database: remove all snapshots of
                        jobs not listed in the jobs file and keep only the
                        latest RETAIN_LIMIT snapshots for remaining jobs
                        (default: 1)
  --clean-database [RETAIN_LIMIT], --clean-cache [RETAIN_LIMIT]
                        clean up the database by keeping only the latest
                        RETAIN_LIMIT snapshots (default: 1)
  --rollback-database TIMESTAMP, --rollback-cache TIMESTAMP
                        delete changed snapshots added since TIMESTAMP (backup
                        the database before using!)
  --delete-snapshot JOB
                        delete the last saved changed snapshot of JOB (index
                        or URL/command)
  --change-location JOB NEW_LOCATION
                        change the location of an existing JOB (index or
                        URL/command)

miscellaneous:
  --check-new           check if a new release is available
  --install-chrome      install or update Google Chrome browser
  --features            list supported job kinds, filters and reporters
                        (including those loaded from hooks.py)
  --detailed-versions   list detailed versions including those of installed
                        dependencies

override configuration file:
  --database-engine DATABASE_ENGINE
                        override database engine to use
  --max-snapshots NUM_SNAPSHOTS
                        override maximum number of changed snapshots to retain
                        in database (sqlite3 only)

deprecated:
  --add JOB             add a job (key1=value1,key2=value2,...) [use --edit
                        instead]
  --delete JOB          delete a job (by index or URL/command) [use --edit
                        instead]

Full documentation is at https://webchanges.readthedocs.io/

Select subset of jobs

Add job number(s) (a joblist) to the command line to run a subset of jobs; for example, webchanges 2 3 9 will only run jobs #2, #3, and #9, and webchanges -1 will only run the last job. Find the index numbering of your jobs by running webchanges --list. API is experimental and may change in the near future.
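
As an illustration of the indexing convention only (this is not webchanges' own code), the following Python sketch shows how 1-based and negative indices pick entries from a job list. The `select_jobs` helper and the sample job names are hypothetical.

```python
def select_jobs(jobs, indices):
    """Return the jobs picked by 1-based indices; negative indices count from the end."""
    selected = []
    for index in indices:
        if index == 0 or abs(index) > len(jobs):
            raise IndexError(f"Job index {index} is out of range (1 to {len(jobs)})")
        # Positive indices are 1-based, as in the job listing; -1 is the last job.
        position = index - 1 if index > 0 else len(jobs) + index
        selected.append(jobs[position])
    return selected


jobs = ["job A", "job B", "job C", "job D"]  # hypothetical job list
print(select_jobs(jobs, [2, 3]))  # ['job B', 'job C']
print(select_jobs(jobs, [-1]))    # ['job D']
```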

Added in version 3.6.

Changed in version 3.8: Accepts negative indices.

Show errors and no-data jobs

You can run all jobs and see those that result in an error or that, after filtering, return no data by running webchanges with the --errors command line argument. This helps detect jobs that may no longer be monitoring resources as expected. No snapshots are saved from this run.

Warning

Do not use this argument to test newly modified jobs: it makes conditional requests to websites, and jobs whose site reports no changes since webchanges last saved a snapshot are skipped. Use --test instead. To remove a blank snapshot use --delete-snapshot; to see the saved snapshots use --dump-history.

By default, the output will go to stdout, but you can add any reporter name to the command line argument to have the output use that reporter. For example, to be notified by email of any errors, run the following:

webchanges --errors email

Changed in version 3.17: Send output to any reporter.

Changed in version 3.18: Use conditional requests to improve speed.

Test run a job or check config and job files for errors

You can test a job and its filters by using the command line argument --test followed by the job index number (from --list) or its URL/command; webchanges will display the filtered output. This lets you easily test changes to filters. Use a negative index number to select a job from the bottom of your job list (i.e. -1 is the last job, -2 is the second to last job, etc.). Combine --test with --verbose to get more information, for example the text returned from a website with a 4xx (client error) status code, or, if using use_browser: true, a screenshot, a full page image, and the HTML contents at the moment of failure (see log for filenames):

webchanges --verbose --test 1

Please note that max_tries will be ignored by --test.

To only check the config, job and hooks files for errors, use --test without a JOB:

webchanges --test

Changed in version 3.8: Accepts negative indices.

Changed in version 3.10.2: JOB no longer required (will only check the config and job files for errors).

Changed in version 3.11: When JOB is not specified, the hooks file is also checked for syntax errors (in addition to the config and jobs files).

Changed in version 3.14: Saves the screenshot, full page image and HTML contents when a url job with use_browser: true fails while running in verbose mode.

Show diff from saved snapshots

The command line argument --test-differ followed by the job index number (from --list) or its URL/command displays diffs computed from all the snapshots that have been saved, applying the diff filters currently defined; a minimum of 2 saved snapshots is required. This allows you to test the effect of a diff filter and/or retrieve historical diffs (changes). Use a negative index number to select a job from the bottom of your job list (i.e. -1 is the last job, -2 is the second to last job, etc.).

You can test what the diff looks like with a reporter by combining this with --test-reporter. For example, to see what diffs from job 1 look like in HTML if running on a machine with a web browser, run this:

webchanges --test-differ 1 --test-reporter browser

Optionally, you can specify the maximum number of comparisons (diffs) to run, instead of producing diffs for all the snapshots that have been saved:

webchanges --test-differ 1 2 --test-reporter browser  # run differ for job 1 a maximum of 2 times
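
To make the counting concrete, here is a minimal Python sketch of how a set of saved snapshots yields diffs and how a maximum number of comparisons caps them. It assumes each snapshot is compared with the one saved just before it, newest pair first; that ordering, the `diff_pairs` helper and the snapshot labels are assumptions for illustration, not webchanges' implementation.

```python
def diff_pairs(snapshots, max_comparisons=None):
    """Pair each snapshot with the one saved just before it, newest pair first."""
    # snapshots is assumed to be ordered from newest to oldest.
    pairs = list(zip(snapshots[1:], snapshots[:-1]))  # (older, newer)
    if max_comparisons is not None:
        pairs = pairs[:max_comparisons]
    return pairs


snapshots = ["v4", "v3", "v2", "v1"]  # hypothetical snapshots, newest first
print(diff_pairs(snapshots))     # [('v3', 'v4'), ('v2', 'v3'), ('v1', 'v2')]
print(diff_pairs(snapshots, 2))  # [('v3', 'v4'), ('v2', 'v3')]
print(diff_pairs(["v1"]))        # [] (a single snapshot gives nothing to compare)
```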

Changed in version 3.3: Will now display all saved snapshots instead of only the latest 10.

Changed in version 3.8: Accepts negative indices.

Changed in version 3.9: Can be used in combination with --test-reporter.

Changed in version 3.22: Added the maximum number of comparisons to perform (optional).

Test a reporter

You can test a reporter by using the command line argument --test-reporter followed by the reporter name; webchanges will create a dummy report and send it through the selected reporter. This will help in debugging issues, especially when used in conjunction with -vv:

webchanges -vv --test-reporter telegram

Changed in version 3.9: Can be used in combination with --test-differ to redirect the output of the diff to a reporter.

Add a footnote to your reports

You can use the command line argument --footnote to add a footnote to the reports:

webchanges --footnote "This report was made by me."

Added in version 3.13.

Updating a URL and keeping past history

Job history is stored based on the value of the url or command parameter, so updating a job’s URL in the jobs file urls.yaml will create a new job with no history. To retain the history, run --change-location before modifying the jobs file (i.e. while the job is still listed with the old URL or command):

webchanges --change-location https://example.org#old https://example.org#new
# or
webchanges --change-location old_command new_command

Added in version 3.13.

Delete the latest saved snapshot

You can delete the latest saved snapshot of a job by running webchanges with the --delete-snapshot command line argument followed by the job index number (from --list) or its URL/command. This is extremely useful when a website is redesigned and your filters behave in unexpected ways (for example, by capturing nothing):

  • Update your filters to once again capture the content you’re monitoring, testing the job by running webchanges with the --test command line argument (see "Test run a job or check config and job files for errors" above);

  • Delete the job’s latest snapshot using --delete-snapshot;

  • Run webchanges again; this time the diff report will contain useful information on whether any content has changed.

This feature does not work with database engines textfiles and minidb.

Added in version 3.5.

Changed in version 3.8: Also works with redis database engine.

Rollback the database

You can roll back the snapshots database to an earlier time by running webchanges with the --rollback-database command line argument followed by a Unix timestamp indicating the point in time you want to go back to. This is useful when you missed notifications or they got lost: roll back the database to the time of the last good report, then run webchanges again to get a new report with the differences since that time.

You can find multiple sites that calculate Unix time for you, such as www.unixtimestamp.com.
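
A Unix timestamp can also be computed locally. The sketch below uses only Python's standard library; the `to_unix_timestamp` helper and the example date are hypothetical.

```python
from datetime import datetime, timezone


def to_unix_timestamp(moment):
    """Return whole seconds since the Unix epoch for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("Use a timezone-aware datetime to avoid ambiguity")
    return int(moment.timestamp())


# Hypothetical time of the last good report: 15 January 2024, 09:30 UTC.
print(to_unix_timestamp(datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)))  # 1705311000
```

The printed number is the value to pass as TIMESTAMP, for example webchanges --rollback-database 1705311000.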

Warning

All snapshots captured after the timestamp are permanently deleted. This deletion is irreversible. Do back up the database file before doing a rollback in case of a mistake (or fat-finger).
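
One way to make such a backup is a plain file copy taken while webchanges is not running. The following is a minimal sketch using Python's standard library; the `backup_database` helper is hypothetical, and the path you pass in is whatever database file your installation uses (for example the FILE given to --database).

```python
import shutil
import time
from pathlib import Path


def backup_database(database_path):
    """Copy the database file next to itself with a timestamped suffix; return the copy's path."""
    source = Path(database_path)
    if not source.is_file():
        raise FileNotFoundError(f"No database file at {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    destination = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, destination)  # copy2 also preserves file metadata
    return destination
```

If the rollback turns out to be a mistake, copying the backup over the database file restores the earlier state.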

This feature does not work with database engines redis, textfiles or minidb.

Added in version 3.2.

Changed in version 3.11: Renamed from --rollback-cache.

Compact the database

You can compact the snapshots database by running webchanges with either the --gc-database (‘garbage collect’) or --clean-database command line argument.

Running with --gc-database will purge all snapshots of jobs that are no longer in the jobs file and, for those in the jobs file, older changed snapshots other than the most recent one for each job. It will also rebuild (and therefore defragment) the database using SQLite’s VACUUM command. You can indicate a RETAIN_LIMIT for the number of older changed snapshots to retain (default: 1, the latest).

Tip

If you use multiple jobs files, use --gc-database in conjunction with a glob --jobs command, e.g. webchanges --jobs "jobs*.yaml" --gc-database. To ensure that the glob is correct, run e.g. webchanges --jobs "jobs*.yaml" --list.

Running with --clean-database will remove all older snapshots keeping the most recent RETAIN_LIMIT ones for each job (whether it is still present in the jobs file or not) and rebuild (and therefore defragment) the database using SQLite’s VACUUM command.
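
The difference between the two options can be sketched in Python as follows. This models only the retention rules described above on an in-memory dictionary (it does not touch a real database or run VACUUM); the function names and sample data are hypothetical.

```python
def gc_database(snapshots_by_job, jobs_in_file, retain_limit=1):
    """Drop jobs missing from the jobs file, then keep the newest retain_limit snapshots."""
    return {
        job: snapshots[:retain_limit]
        for job, snapshots in snapshots_by_job.items()
        if job in jobs_in_file
    }


def clean_database(snapshots_by_job, retain_limit=1):
    """Keep the newest retain_limit snapshots of every job, listed in the jobs file or not."""
    return {job: snapshots[:retain_limit] for job, snapshots in snapshots_by_job.items()}


# Hypothetical database contents, snapshots ordered newest first.
database = {
    "https://example.org/a": ["a3", "a2", "a1"],
    "https://example.org/removed": ["r2", "r1"],
}
jobs_in_file = {"https://example.org/a"}

print(gc_database(database, jobs_in_file))
# {'https://example.org/a': ['a3']}
print(clean_database(database, retain_limit=2))
# {'https://example.org/a': ['a3', 'a2'], 'https://example.org/removed': ['r2', 'r1']}
```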

Changed in version 3.11: Renamed from --gc-cache and --clean-cache.

Changed in version 3.13: Added RETAIN_LIMIT.

Database engine

--database-engine will override the value in the configuration file (see Default database engine).

Added in version 3.2.

Maximum number of snapshots to save

--max-snapshots will override the value in the configuration file (see Maximum number of snapshots to save).

Added in version 3.3: For default sqlite3 database engine only.