Command line arguments

Overview

usage: webchanges [-h] [-V] [-v] [--log-file FILE] [--jobs FILE] [--config FILE] [--hooks FILE]
                  [--database FILE] [--list-jobs [REGEX]] [--errors [REPORTER]] [--test [JOB]]
                  [--no-headless] [--test-differ JOB [JOB ...]] [--dump-history JOB]
                  [--max-workers WORKERS] [--test-reporter REPORTER] [--smtp-login]
                  [--telegram-chats] [--xmpp-login] [--footnote FOOTNOTE] [--edit-jobs]
                  [--edit-config] [--edit-hooks] [--gc-database [RETAIN_LIMIT]]
                  [--clean-database [RETAIN_LIMIT]] [--rollback-database TIMESTAMP]
                  [--delete-snapshot JOB] [--prepare-jobs] [--change-location JOB NEW_LOCATION]
                  [--check-new] [--install-chrome] [--features] [--detailed-versions]
                  [--database-engine DATABASE_ENGINE] [--max-snapshots NUM_SNAPSHOTS]
                  [JOB(S) ...]

Checks web content, including images, to detect any changes since the prior run. If any are found, it
summarizes (including with Gen AI) what changed ('diff') and displays it and/or sends it via email
and/or other supported services. Can check the output of local commands as well.

positional arguments:
  JOB(S)                JOB(S) to run (index number(s) as per --list; if one also URL/command)
                        (default: run all jobs)

options:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         show logging output; use -vv or -vvv for more verbosity
  --log-file FILE       send log to FILE

override file defaults:
  --jobs FILE           read job list (URLs/commands) from FILE or files matching a glob pattern
  --config FILE         read configuration from FILE
  --hooks FILE          use FILE or files matching a glob pattern as hooks.py module to import
  --database FILE       use FILE as snapshots database; FILE can be a redis URI

job management:
  --list-jobs [REGEX]   list jobs and their index number (optional: only those who match REGEX)
  --errors [REPORTER]   test-run enabled jobs and list those with errors or no data captured;
                        optionally send output to REPORTER
  --test [JOB]          test a JOB (by index or URL/command) and show filtered output; if no JOB,
                        check config and jobs file(s) syntax for errors
  --no-headless         turn off browser headless mode (for jobs using a browser)
  --test-differ JOB [JOB ...]
                        show diff(s) using existing saved snapshots of a JOB (by index or
                        URL/command); can be combined with --test-reporter
  --dump-history JOB    print all saved changed snapshots for a JOB (by index or URL/command)
  --max-workers WORKERS
                        maximum number of parallel threads

reporters:
  --test-reporter REPORTER
                        test the REPORTER or redirect output of --test-differ
  --smtp-login          verify SMTP login credentials with server (and enter or check password if
                        using keyring)
  --telegram-chats      list telegram chats webchanges is joined to
  --xmpp-login          enter or check password for XMPP (stored in keyring)
  --footnote FOOTNOTE   FOOTNOTE text (quoted text)

launch editor ($EDITOR/$VISUAL):
  --edit-jobs           edit jobs (URL/command) list
  --edit-config         edit configuration file
  --edit-hooks          edit hooks script

database:
  --gc-database [RETAIN_LIMIT]
                        garbage collect the database: remove all snapshots of jobs not listed in the
                        jobs file and keep only the latest RETAIN_LIMIT snapshots for remaining jobs
                        (default: 1)
  --clean-database [RETAIN_LIMIT]
                        clean up the database by keeping only the latest RETAIN_LIMIT snapshots
                        (default: 1)
  --rollback-database TIMESTAMP
                        delete changed snapshots added since TIMESTAMP
  --delete-snapshot JOB
                        delete the last saved changed snapshot of JOB (index or URL/command)
  --prepare-jobs        run newly added jobs (i.e. those without snapshots)
  --change-location JOB NEW_LOCATION
                        change the location of an existing JOB (index or URL/command)

miscellaneous:
  --check-new           check if a new release is available
  --install-chrome      install or update Google Chrome browser
  --features            list supported job kinds, filters and reporters (including those loaded from
                        hooks.py)
  --detailed-versions   list detailed versions including those of installed dependencies

override configuration file:
  --database-engine DATABASE_ENGINE
                        override database engine to use
  --max-snapshots NUM_SNAPSHOTS
                        override maximum number of changed snapshots to retain in database (sqlite3
                        only)

Full documentation is at https://webchanges.readthedocs.io/

Select subset of jobs

Add job index number(s) (a “joblist”) to the command line to run a subset of jobs; for example, webchanges 2 3 9 will only run jobs #2, #3, and #9, and webchanges -1 will only run the last job. Find the index numbering of your jobs by running webchanges --list.
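The index convention above (1-based, with negative numbers counting from the end) can be sketched in a few lines of Python. This is only an illustration: the function name select_jobs and the placeholder job names are hypothetical and are not part of webchanges.

```python
def select_jobs(jobs, indices):
    """Return the jobs picked by 1-based indices; negative indices count from the end."""
    selected = []
    for index in indices:
        if index == 0 or abs(index) > len(jobs):
            raise IndexError(f"Job index {index} out of range (found {len(jobs)} jobs)")
        # Positive indices are 1-based, so shift by one; negative ones map directly.
        selected.append(jobs[index - 1] if index > 0 else jobs[index])
    return selected


jobs = ["job-a", "job-b", "job-c", "job-d"]  # placeholder job names (hypothetical)
print(select_jobs(jobs, [2, 3]))  # jobs #2 and #3
print(select_jobs(jobs, [-1]))    # the last job
```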

Added in version 3.6.

Changed in version 3.8: Accepts negative indices.

Custom job file specification

By default, the job file is named jobs.yaml and is located in the following directory:

  • Linux: ~/.config/webchanges

  • macOS: ~/Library/Preferences/webchanges

  • Windows: %USERPROFILE%\Documents\webchanges (the webchanges folder within your Documents folder)

Use the --jobs command line argument to specify a file with a different name or location, or a glob pattern for multiple files (the contents of matching files will be combined):

  • If you specify a file name without a directory, webchanges searches:

      1. The current directory
      2. The default directory (if not found in the current directory)

  • If you specify a file name without a suffix and the file is not found, webchanges will attempt to load a file with the .yaml suffix.

  • If you specify a file name that does not start with jobs, webchanges will attempt to load a file with a jobs- prefix (and one with both a jobs- prefix and a .yaml suffix).

    Example: --jobs test will try test, test.yaml, jobs-test, and jobs-test.yaml, and the first matching file will be loaded.

Multiple job files

To load multiple job files in a single run, glob patterns are supported, as well as repeating the --jobs argument:

webchanges --jobs file1.yaml --jobs file2.yaml --jobs morningrun/*.yaml

The contents of all specified files will be combined by appending them in order.

Changed in version 3.25: Added ability to repeat the argument multiple times.

Smart file specification

The command-line arguments --jobs, --config, and --hooks feature a “smart file specification” capability. This allows you to provide a shorthand name for your files, and webchanges will automatically search for several variations of that name.

The search process is as follows:

  • For --jobs <name>:

      1. <name>
      2. <name>.yaml
      3. jobs-<name>
      4. jobs-<name>.yaml

    Example: --jobs myjobs will look for myjobs, then myjobs.yaml, jobs-myjobs, and finally jobs-myjobs.yaml.

  • For --config <name>:

      1. <name>
      2. <name>.yaml
      3. config-<name>
      4. config-<name>.yaml

    Example: --config myconfig will look for myconfig, then myconfig.yaml, config-myconfig, and finally config-myconfig.yaml.

  • For --hooks <name>:

      1. <name>
      2. <name>.py
      3. hooks-<name>
      4. hooks-<name>.py

    Example: --hooks myhooks will look for myhooks, then myhooks.py, hooks-myhooks, and finally hooks-myhooks.py.
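The search order listed above can be expressed as a small Python sketch. It only builds the list of names to try. The function name candidate_names is hypothetical, and this is not webchanges' own code.

```python
def candidate_names(name, prefix, suffix):
    """List the file names tried for a shorthand name, in search order (illustrative only)."""
    return [
        name,
        f"{name}{suffix}",
        f"{prefix}-{name}",
        f"{prefix}-{name}{suffix}",
    ]


# One row per option: (shorthand, prefix, suffix), as in the examples above.
for shorthand, prefix, suffix in [
    ("myjobs", "jobs", ".yaml"),
    ("myconfig", "config", ".yaml"),
    ("myhooks", "hooks", ".py"),
]:
    print(candidate_names(shorthand, prefix, suffix))
```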

Changed in version 3.31: Added prefix and expanded from --jobs to also include --config and --hooks.

List jobs and their index number

You can list all the jobs in a jobs file by using the command line argument --list-jobs. You can filter this list by following this argument with a Python regular expression. For example --list-jobs blue will list only jobs that have the word ‘blue’ in their listing name (but not ‘BLUE’), while --list-jobs (?i)blue will do the same but in a case-insensitive manner.
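The difference between the two patterns can be tried out with Python's re module. The sketch below assumes the pattern is searched for anywhere in the listing line. The function name filter_listing and the sample listing lines are hypothetical.

```python
import re


def filter_listing(listings, pattern):
    """Keep only the listing lines in which the regular expression is found."""
    regex = re.compile(pattern)
    return [line for line in listings if regex.search(line)]


listings = ["1: blue sky", "2: BLUE moon", "3: red sun"]  # sample lines (hypothetical)
print(filter_listing(listings, "blue"))      # case-sensitive match
print(filter_listing(listings, "(?i)blue"))  # inline flag makes it case-insensitive
```

In most shells a pattern containing parentheses needs to be quoted so that the shell passes it through unchanged.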

Changed in version 3.25: Added ability to filter list using a RegEx.

Show errors and no-data jobs

You can run all jobs and see those that result in an error or that, after filtering, return no data by running webchanges with the --errors command line argument. This can help detect jobs that may no longer be monitoring resources as expected. No snapshots are saved from this run.

To restrict the check to a subset of jobs, append a “joblist” of job index numbers (as per --list) and/or URLs/commands to the command line; for example, webchanges --errors 2 5 9 will check only jobs 2, 5 and 9. Without a joblist, all enabled jobs are checked.

Warning

Do not use this argument to test newly modified jobs since it does conditional requests on websites, and those reporting no changes since the time webchanges saved a snapshot are skipped. Use --test instead. To remove a blank snapshot use --delete-snapshot; to see the saved snapshots use --dump-history.

By default, the output will go to stdout, but you can add any reporter name to the command line argument to have the output use that reporter. For example, to be notified by email of any errors, run the following:

webchanges --errors email

Please note that since no reporting is involved, --errors runs faster than a regular run and this has been known to cause DNS errors (e.g. [Errno -3] Try again) when using a slow resolver (see here). To reduce these and other errors, --max-workers defaults to 1 (no parallel job execution).

Changed in version 3.17: Send output to any reporter.

Changed in version 3.18: Use conditional requests to improve speed.

Changed in version 3.31: Default --max-workers to 1 to reduce spurious errors.

Changed in version 3.36: Honor a “joblist” of positional JOB(S) arguments to restrict the check to a subset of jobs.

Test run a job or check config and job files for errors

You can test a job and its filter by using the command line argument --test followed by the job index number (from --list) or its URL/command; webchanges will display the filtered output. This makes it easy to test changes to filters. Use a negative index number to select a job from the bottom of your job list (i.e. -1 is the last job, -2 is the second to last job, etc.).

Combine --test with --verbose to get more information, for example the text returned from a website with a 4xx (client error) status code. For jobs with use_browser: true, a failure in verbose mode also saves a screenshot, a full page image, and the HTML contents of the page to a temporary folder (see the log for filenames):

webchanges --verbose --test 1

The output of the test can be redirected to any reporter by combining it with --test-reporter:

webchanges --verbose --test 1 --test-reporter browser

Please note that max_tries will be ignored by --test.

To only check the config, job and hooks files for errors, use --test without specifying a JOB:

webchanges --test

Changed in version 3.8: Accepts negative indices.

Changed in version 3.10.2: JOB no longer required (will only check the config and job files for errors).

Changed in version 3.11: When JOB is not specified, the hooks file is also checked for syntax errors (in addition to the config and jobs files).

Changed in version 3.14: Saves the screenshot, full page image and HTML contents when a url job with use_browser: true fails while running in verbose mode.

Changed in version 3.27: Uses built-in reporters for output, and can be combined with --test-reporter.

Show diff from saved snapshots

The command line argument --test-differ followed by the job index number (from --list) or its URL/command displays diffs, with the currently defined diff filters applied, for all the snapshots that have been saved; a minimum of two saved snapshots is required. This allows you to test the effect of a diff filter and/or retrieve historical diffs (changes). Use a negative index number to select a job from the bottom of your job list (i.e. -1 is the last job, -2 is the second to last job, etc.).

You can test what the diff looks like with a reporter by combining this with --test-reporter. For example, to see what diffs from job 1 look like in HTML if running on a machine with a web browser, run this:

webchanges --test-differ 1 --test-reporter browser

Optionally, you can specify the maximum number of comparisons (diffs) to run, instead of producing diffs for all the snapshots that have been saved:

webchanges --test-differ 1 2 --test-reporter browser  # run differ for job 1 a maximum of 2 times
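As a rough illustration of why at least two snapshots are needed and what the optional maximum does, the following Python sketch diffs each saved snapshot against the one before it using the standard difflib module. It assumes snapshots are held newest first. The function name historical_diffs and the sample data are hypothetical, and webchanges' own differs and diff filters are not modeled.

```python
import difflib


def historical_diffs(snapshots, max_comparisons=None):
    """Diff each snapshot against the one saved before it (snapshots are newest first)."""
    # Pair every snapshot with its predecessor: (older, newer).
    pairs = list(zip(snapshots[1:], snapshots[:-1]))
    if max_comparisons is not None:
        pairs = pairs[:max_comparisons]
    return [
        "\n".join(
            difflib.unified_diff(
                older.splitlines(), newer.splitlines(), "old", "new", lineterm=""
            )
        )
        for older, newer in pairs
    ]


snapshots = ["v3", "v2", "v1"]  # sample snapshot contents, newest first (hypothetical)
for diff in historical_diffs(snapshots, max_comparisons=2):
    print(diff)
    print()
```

With fewer than two snapshots there is nothing to pair, so the function returns an empty list.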

Changed in version 3.3: Will now display all saved snapshots instead of only the latest 10.

Changed in version 3.8: Accepts negative indices.

Changed in version 3.9: Can be used in combination with --test-reporter.

Changed in version 3.22: Added the maximum number of comparisons to perform (optional).

Test a reporter

You can test a reporter by using the command line argument --test-reporter followed by the reporter name; webchanges will create a dummy report and send it through the selected reporter. This will help in debugging issues, especially when used in conjunction with -vv or -vvv:

webchanges -vv --test-reporter telegram

When --test-reporter is followed by one or more job indexes (a positional joblist) it instead overrides the configured reporters for that run: the listed jobs are fetched, filtered, and diffed against the snapshot database, and the resulting report is sent to the named reporter only — no other reporter (e.g. email) fires. The snapshot database is not written to (read-only), so this is safe to use to preview what a reporter would emit:

webchanges --test-reporter stdout 1      # run job 1 and route its report to stdout only
webchanges --test-reporter telegram 1 3  # run jobs 1 and 3 and route to telegram only

Changed in version 3.9: Can be used in combination with --test-differ to redirect the output of the diff to a reporter.

Changed in version NEXT: When combined with a positional joblist, --test-reporter runs those jobs (read-only; no snapshots saved) and routes the report to the named reporter only, overriding the reporters enabled in the configuration.

Add a footnote to your reports

You can use the command line argument --footnote to add a footnote to the reports:

webchanges --footnote "This report was made by me."

Added in version 3.13.

Compact the database

You can compact the snapshots database by running webchanges with either the --gc-database (‘garbage collect’) or --clean-database command line argument.

Running with --gc-database will purge all snapshots of jobs that are no longer in the jobs file and, for those in the jobs file, older changed snapshots other than the most recent one for each job. It will also rebuild (and therefore defragment) the database using SQLite’s VACUUM command. You can indicate a RETAIN_LIMIT for the number of older changed snapshots to retain (default: 1, the latest).

Tip

If you use multiple jobs files, use --gc-database in conjunction with a glob --jobs command, e.g. webchanges --jobs "jobs*.yaml" --gc-database. To ensure that the glob is correct, run e.g. webchanges --jobs "jobs*.yaml" --list.

Running with --clean-database will remove all older snapshots keeping the most recent RETAIN_LIMIT ones for each job (whether it is still present in the jobs file or not) and rebuild (and therefore defragment) the database using SQLite’s VACUUM command.
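The difference between the two arguments can be sketched with a plain dictionary standing in for the database. This is a simplified model under stated assumptions: each job maps to a list of snapshots ordered newest first, and the function names gc_database and clean_database are hypothetical helpers, not webchanges' API. The VACUUM step is not modeled.

```python
def gc_database(db, jobs_in_file, retain_limit=1):
    """Drop jobs missing from the jobs file and keep the newest snapshots of the rest."""
    return {
        job: snapshots[:retain_limit]
        for job, snapshots in db.items()
        if job in jobs_in_file
    }


def clean_database(db, retain_limit=1):
    """Keep the newest snapshots of every job, whether or not it is in the jobs file."""
    return {job: snapshots[:retain_limit] for job, snapshots in db.items()}


# Sample data (hypothetical): snapshots are listed newest first.
db = {
    "https://example.com/a": ["a3", "a2", "a1"],
    "https://example.com/gone": ["g2", "g1"],
}
print(gc_database(db, {"https://example.com/a"}))
print(clean_database(db, retain_limit=2))
```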

Changed in version 3.11: Renamed from --gc-cache and --clean-cache.

Changed in version 3.13: Added RETAIN_LIMIT.

Rollback the database

You can roll back the snapshots database to an earlier time by running webchanges with the --rollback-database command line argument followed by either an ISO-8601 formatted date or a Unix timestamp indicating the point in time you want to go back to. If you have the Python library dateutil installed on the system (not a dependency of webchanges), then you can use any string recognized by dateutil.parser, including date only, time only, date and time, etc. See examples here.

This is useful when you missed notifications or they got lost: roll back the database to the time of the last good report, then run webchanges again to get a new report with the differences since that time.

You can find multiple sites that calculate Unix time for you, such as www.unixtimestamp.com.
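If you prefer to compute or check a timestamp locally, the Python sketch below accepts either a Unix timestamp or an ISO-8601 string and shows which snapshots would survive a rollback. It rests on two assumptions that may differ from webchanges' behavior: values without a time zone are treated as UTC, and only snapshots strictly later than the cutoff are dropped. The function names parse_rollback_time and rollback are hypothetical.

```python
from datetime import datetime, timezone


def parse_rollback_time(value):
    """Convert a Unix timestamp or ISO-8601 string into seconds since the epoch."""
    try:
        return float(value)  # already a Unix timestamp
    except ValueError:
        pass
    moment = datetime.fromisoformat(value)
    if moment.tzinfo is None:
        # Assumption for this sketch: naive values are interpreted as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.timestamp()


def rollback(snapshots, cutoff):
    """Keep (timestamp, data) snapshots saved at or before the cutoff."""
    return [(saved, data) for saved, data in snapshots if saved <= cutoff]


cutoff = parse_rollback_time("2024-01-01T00:00:00+00:00")
snapshots = [(1704067100.0, "before"), (1704067300.0, "after")]  # sample data (hypothetical)
print(cutoff)
print(rollback(snapshots, cutoff))
```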

Warning

All snapshots captured after the timestamp are permanently deleted. This deletion is irreversible. Do back up the database file before doing a rollback in case of a mistake (or fat-finger).

This feature does not work with database engines redis, textfiles or minidb.

Added in version 3.2.

Changed in version 3.11: Renamed from --rollback-cache.

Changed in version 3.24: Recognizes ISO-8601 formats and defaults to using dateutil.parser if found installed.

Save snapshot for newly added job

To run only newly added jobs to capture and save their initial snapshot, run with --prepare-jobs. This can be combined with a “joblist” on the command line, in which case the new jobs are run in addition to the jobs in the joblist. The following will run jobs #10 and #12 plus any jobs that have never been run before (i.e. no starting snapshot has ever been captured):

webchanges --prepare-jobs 10 12
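The selection rule amounts to a set union of the joblist and the jobs that have no snapshot yet. A minimal Python sketch follows, with a hypothetical function name and sample data, using 1-based indices as elsewhere on this page.

```python
def jobs_to_prepare(all_jobs, jobs_with_snapshots, joblist=()):
    """Return the 1-based indices to run: the joblist plus every job without a snapshot."""
    chosen = set(joblist)
    for index, job in enumerate(all_jobs, start=1):
        if job not in jobs_with_snapshots:
            chosen.add(index)
    return sorted(chosen)


all_jobs = ["a", "b", "c", "d"]        # sample jobs (hypothetical)
jobs_with_snapshots = {"a", "b", "c"}  # job "d" has never been run
print(jobs_to_prepare(all_jobs, jobs_with_snapshots, joblist=[1, 2]))
```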

Added in version 3.27.

Changed in version 3.28: Added ability to combine with joblist.

Delete the latest saved snapshot

You can delete the latest saved snapshot of a job by running webchanges with the --delete-snapshot command line argument followed by the job index number (from --list) or its URL/command. This is extremely useful when a website is redesigned and your filters behave in unexpected ways (for example, by capturing nothing):

  • Update your filters to once again capture the content you’re monitoring, testing the job by running webchanges with the --test command line argument (see here);

  • Delete the latest job’s snapshot using --delete-snapshot;

  • Run webchanges again; this time the diff report will contain useful information on whether any content has changed.

This feature does not work with database engines textfiles and minidb.

Added in version 3.5.

Changed in version 3.8: Also works with redis database engine.

Updating a URL and keeping past history

Job history is stored based on the value of the url or command parameter, so updating a job’s URL in the jobs file (jobs.yaml by default) will create a new job with no history. To retain history, use --change-location before modifying the jobs file (i.e. while the job is still listed with the old URL or command):

webchanges --change-location https://example.org#old https://example.org#new
# or
webchanges --change-location old_command new_command
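Why the order of operations matters can be seen in a simplified model where history is a dictionary keyed by location. This is an assumption made for illustration, since the actual storage layout of webchanges is not described here, and change_location is a hypothetical helper.

```python
def change_location(history, old_location, new_location):
    """Move a job's saved snapshots from the old location key to the new one."""
    if old_location not in history:
        raise KeyError(f"No history found for {old_location}")
    if new_location in history:
        raise ValueError(f"History already exists for {new_location}")
    history[new_location] = history.pop(old_location)


# Sample data (hypothetical): snapshots stored under the old location.
history = {"https://example.org#old": ["snapshot-2", "snapshot-1"]}
change_location(history, "https://example.org#old", "https://example.org#new")
print(history)
```

In this model, a lookup under the new URL without the move would find nothing, which mirrors a job starting over with no history.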

Added in version 3.13.

Database engine

--database-engine will override the value in the configuration file (see Database engine).

Added in version 3.2.

Maximum number of snapshots to save

--max-snapshots will override the value in the configuration file (see max_snapshots).

Added in version 3.3: For default sqlite3 database engine only.

Log -v/--verbose output to file

Use --log-file to send the log output from -v, -vv, or -vvv to a file:

webchanges -vv --log-file webchanges.log

Added in version 3.27.