Kibitzr

Personal Web Assistant


Why Python is a perfect language for Kibitzr

Kibitzr is a command-line utility that, depending on its configuration, polls web pages and notifies you of changes through different channels. Kibitzr is written in Python, and the first version was built within an hour. That speed of development comes from the nature of Python itself and from its ecosystem.

Python, being an interpreted language, is not known as the most performant choice. On the other hand, much of the CPython interpreter (and some third-party libraries) is written in C.

Kibitzr uses Python to glue these highly optimized pieces together, achieving both runtime speed and development agility.

Configuration

Kibitzr’s configuration file has to be expressive and human-editable. That’s why it uses YAML, parsed with the excellent PyYAML package.

While the current version is more complex and has some post-processing, the first one was just:

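import yaml

# safe_load builds only plain data types (dicts, lists, strings, numbers);
# recent PyYAML releases reject a bare yaml.load(fp) without a Loader.
with open("kibitzr.yml") as fp:
    conf = yaml.safe_load(fp)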

Command line interface

Writing an argument parser from scratch can be tough. That’s why Python’s standard library ships two of them: argparse and the older optparse. But Kibitzr uses Click instead, so the whole argument parsing magic is wrapped inside a stack of decorators:

import click

@click.command()
@click.option("--once", is_flag=True,
              help="Run checks once and exit")
@click.option("-l", "--log-level", default="info",
              type=click.Choice(LOG_LEVEL_CODES.keys()),
              help="Logging level")
@click.argument('name', nargs=-1)
def entry(once, log_level, name):
    ...

To make kibitzr an executable, the console_scripts entry point of setuptools is used:

setup(
    ...
    entry_points={
        'console_scripts': [
            'kibitzr=kibitzr.cli:entry'
        ]
    },
    ...
)

Schedule

Kibitzr checks are recurrent. The excellent schedule library allows building a schedule with

schedule.every(period).seconds.do(checker.check)

and executing the jobs that are due with

schedule.run_pending()
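
To make the every/run_pending idea concrete, here is a stdlib-only sketch of the same polling pattern. It is not how the schedule library is implemented; `Job`, `Scheduler`, and `run_forever` are hypothetical names, and the injectable `clock` parameter is an assumption added to keep the sketch easy to test.

```python
import time


class Job:
    """A callable that should run once every `interval` seconds."""

    def __init__(self, interval, func, now):
        self.interval = interval
        self.func = func
        # In this sketch the first run happens one full interval from now.
        self.next_run = now + interval

    def due(self, now):
        return now >= self.next_run

    def run(self, now):
        self.func()
        self.next_run = now + self.interval


class Scheduler:
    """Hypothetical stand-in for the every(...)/run_pending() pair."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.jobs = []

    def every(self, interval, func):
        self.jobs.append(Job(interval, func, self.clock()))

    def run_pending(self):
        # Run each job whose deadline has passed, then reschedule it.
        now = self.clock()
        for job in self.jobs:
            if job.due(now):
                job.run(now)


def run_forever(scheduler, pause=1.0):
    """Poll the scheduler in a loop; `pause` bounds the timing error."""
    while True:
        scheduler.run_pending()
        time.sleep(pause)
```

Nothing runs in the background here: jobs fire only when `run_pending` is called, which is why the call sits inside a sleep loop.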

HTTP

Although it is a ubiquitous task, making an HTTP request can be cumbersome. Just think about handling redirects, network failures, content encoding… Thanks to requests, it is all done in one line:

response = requests.get(self.url)

JavaScript

Raw HTML contents are sometimes good enough, but not always. Modern websites might not even open without JavaScript. That’s why kibitzr comes with a powerful browser automation library, Selenium, which launches headless Firefox:

with firefox(headless) as driver:
    driver.get(url)
    if scenario:
        run_scenario(driver, scenario)
    if delay:
        time.sleep(delay)

One can perform all kinds of browser interactions, such as authenticating, filling in forms, and clicking buttons.

HTML processing

Of course, working with a full HTML page is uncomfortable. Often the required information is hidden inside one small tag. Kibitzr leverages the powerful lxml and the handy Beautiful Soup to crop contents to just one CSS selector, XPath, or tag. Then kibitzr strips out all HTML markup, leaving only the plain text.
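
The markup-stripping step can be illustrated with the standard library alone. The sketch below uses `html.parser` rather than lxml or Beautiful Soup, so it is not Kibitzr's code, and it does not cover cropping by CSS selector or XPath; `TextExtractor` and `html_to_text` are hypothetical names.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space and collapse whitespace runs. This is a
        # simplification: text split by inline tags gains a space.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML fragment as a single line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Collapsing whitespace keeps the extracted text stable when a page is merely re-indented, which matters for a tool that reports changes.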

JSON

In the golden age of APIs, kibitzr would be incomplete without sophisticated JSON processing support, provided by jq.

Parsing the latest release version and title from the GitHub API is done by a succinct transform:

  - jq: .tag_name + " " + .name
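
For readers less familiar with jq, the same filter can be spelled out in plain Python. This is only an illustration of what the expression computes; `release_summary` is a hypothetical helper, and the sample payload is a made-up, trimmed-down response body.

```python
import json


def release_summary(payload):
    """Python equivalent of the jq filter `.tag_name + " " + .name`."""
    release = json.loads(payload)
    return release["tag_name"] + " " + release["name"]


# Hypothetical sample; a real release object carries many more fields.
sample = '{"tag_name": "v1.0.0", "name": "First release"}'
print(release_summary(sample))
```

The jq one-liner does the same work without any code, which is what makes it a good fit for a configuration file.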

Notifications

Python comes with SMTP support out of the box, and all modern instant messengers (like Slack and Gitter) have API webhooks, accessible through the very same requests library.

But providing every possible integration is not a viable option these days. That’s why kibitzr allows building custom notifiers right inside the configuration file with plain bash or Python snippets.
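
As an illustration of the out-of-the-box SMTP support mentioned above, here is a minimal sketch built on the standard `smtplib` and `email.message` modules. It is not Kibitzr's notifier; the function names, the addresses in the usage note, and the default host and port are all assumptions.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email announcing a change."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_message(message, host="localhost", port=25):
    """Deliver the message through an SMTP server.

    Host and port are placeholders; most real servers also require
    starttls() and login() before send_message().
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(message)
```

A call such as `send_message(build_message("kibitzr@example.com", "me@example.com", "Page changed", new_text))` would then deliver one notification, assuming a server is listening on the given host.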

Storing Changes History

There are many possible ways of persisting change history. Following the UNIX philosophy, kibitzr uses git through a cozy wrapper, sh:

sh.git.add('-A', '.')
sh.git.commit('-m', self.commit_msg)