Why Python is a perfect language for Kibitzr
Kibitzr is a command line utility, that (depending on configuration) polls web pages and notify on changes through different channels. Kibitzr is written in Python and the first version was built within an hour. This speed of development is achieved by Python nature itself and it’s ecosystem.
Python being interpreted language is known for being not the most performant choice. On the other hand, most of its internals (and some of the third-party libraries) are written in pure C.
Kibitzr uses Python to glue these highly optimized pieces together, achieving both runtime speed and development agility.
Configuration
Kibitzr’s configuration file has to be expressive and human-editable. That’s why it uses YAML with awesome PyYAML Python package.
While current version is more complex and has some post-processing, the first one was just:
import yaml
with open("kibitzr.yml") as fp:
conf = yaml.load(fp)
Command line interface
Writing argument parser from scratch can be tough. That’s why Python has two built-in implementations of it: argparse and optparse. But Kibitzr uses Click instead. So whole argument parsing magic is wrapped inside a stack of decorators:
import click
@click.command()
@click.option("--once", is_flag=True,
help="Run checks once and exit")
@click.option("-l", "--log-level", default="info",
type=click.Choice(LOG_LEVEL_CODES.keys()),
help="Logging level")
@click.argument('name', nargs=-1)
def entry(once, log_level, name):
...
To make kibitzr an executable, Python’s built-in entry points are used:
setup(
...
entry_points={
'console_scripts': [
'kibitzr=kibitzr.cli:entry'
]
},
...
)
Schedule
Kibitzr checks are recurrent. An excellent schedule allows building schedule with
schedule.every(period).seconds.do(checker.check)
And execute them with
schedule.run_pending()
HTTP
While it’s an ubiquitous task, making HTTP request can be cumbersome. Just think about handling redirects, handling different network failures, content encoding… Thanks to requests it’s all done in one line:
response = requests.get(self.url)
JavaScript
Having raw HTML contents is good enough sometimes. But not always for sure. Modern websites might not even open without some JavaScript. That’s why kibitzr comes with powerfull browser automation library - Selenium which launches headless Firefox:
with firefox(headless) as driver:
driver.get(url)
if scenario:
run_scenario(driver, scenario)
if delay:
time.sleep(delay)
One can do all kind of browser interactions, like authenticating, filling forms and clicking buttons.
HTML processing
Of course operating with full HTML page is uncomfortable. Often required information is hidden inside one small tag. Kibitzr leverages powerfull lxml and handy Beautiful Soup for cropping contents to just one CSS Selector, XPath, or tag. Than kibitzr strips all HTML markup out leaving the only plain text.
JSON
In the golden age of APIs, kibitzr would be incomplete without sophisticated JSON processing support, provided by jq.
Parsing latest release version and title from GitHub API is done by succinct transform:
- jq: .tag_name + " " + .name
Notifications
Python comes with SMTP support out of the box. Whereas all modern instant messengers (like Slack and Gitter) have API web hooks, accessible through the very same requests library.
But providing all possible integrations is not a viable option these days, that’s why kibitzr allows building custom notifiers right inside the configuration file with a plain bash of Python snippets.
Storing Changes History
There are many possible ways of persisting changes history. Following UNIX philosophy, kibitzr uses git with cozy wrapper - sh:
sh.git.add('-A', '.')
sh.git.commit('-m', self.commit_msg)