WikiCrawl - Demo of Web Automation w/ Selenium and Python3
This is a quick sample project showing the power of web automation. There are a few routes you can take to web automation, ranging from low-level tools (curl, Mechanize, or similar) to higher-level frameworks (Selenium, which typically drives a full browser instance). I recommend the high-level route for modern, dynamic webpages that need to execute JavaScript. What I've shown is a pattern for managing state across the different pages of an app: each page has its own controls and variables, and control passes from page to page as you click links. Typical use cases include logging in past an authentication wall, navigating through menus, entering and verifying form data, and extracting data from a website.
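To make that page-state pattern concrete, here is a minimal page-object sketch. It is not the code this repo uses: the class and method names are illustrative, the Wikipedia selectors (`firstHeading`, `#mw-content-text`) are assumptions about the live site's markup, and it assumes Firefox with geckodriver on your PATH.

```python
# A minimal sketch of the page-object pattern described above (illustrative,
# not the implementation in this repo).
from selenium import webdriver
from selenium.webdriver.common.by import By


class ArticlePage:
    """Wraps one article page; owns that page's locators and state."""

    def __init__(self, driver):
        self.driver = driver

    @property
    def title(self):
        # Assumption: Wikipedia renders the article title in #firstHeading.
        return self.driver.find_element(By.ID, "firstHeading").text

    def follow_first_link(self):
        """Click the first body link, handing state off to a new page object."""
        # Assumption: article body paragraphs live under #mw-content-text.
        link = self.driver.find_element(By.CSS_SELECTOR, "#mw-content-text p a")
        link.click()
        return ArticlePage(self.driver)


driver = webdriver.Firefox()  # assumes geckodriver is on your PATH
driver.get("https://en.wikipedia.org/wiki/Special:Random")
page = ArticlePage(driver)
print(page.title)
page = page.follow_first_link()  # control passes to the next page object
print(page.title)
driver.quit()
```

The point of the pattern is that each page object exposes only what its page can do, so the crawl logic reads as a chain of page transitions rather than a pile of raw selector calls.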
This particular project uses my app_skellington library to provide a CLI menu and configuration via ConfigObj. More fully developed options on the Python side include Typer, Click, and Baker; a rough sketch of one alternative follows.
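For a sense of the alternatives, here is a hedged sketch of the same subcommands written with Click. This is not how the repo is wired (it uses app_skellington), and the command bodies are placeholders:

```python
# Hypothetical Click version of this app's CLI, for comparison only;
# the repo actually builds its menu with app_skellington.
import click


@click.group()
def cli():
    """Road-to-Philosophy crawler."""


@cli.command("open_browser")
def open_browser():
    click.echo("would launch a browser session here")


@cli.command("play_single")
def play_single():
    click.echo("would crawl one article toward Philosophy here")


@cli.command("play_multiple")
def play_multiple():
    click.echo("would crawl several articles here")


if __name__ == "__main__":
    cli()
```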
I believe the Node.js implementation of Selenium is asynchronous. If performance is critical, implementing there may be a good option.
This is a minimally developed app, shown just as a quick demo/starter for novices interested in web automation. For example, the data layer is just a placeholder; in a future state I wanted to save the results in a local sqlite3 db, but I stopped working on it.
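As a hedged sketch of what that data layer might have looked like, the standard-library sqlite3 module would be enough. The table and column names here are my invention, not anything defined in this repo:

```python
# Illustrative only: a possible sqlite3-backed data layer. The repo's
# real datalayer is a placeholder and defines no schema.
import sqlite3


def save_run(db_path, start_article, hops):
    """Record one crawl: the starting article and how many hops it took."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        " id INTEGER PRIMARY KEY, start TEXT, hop_count INTEGER)"
    )
    conn.execute(
        "INSERT INTO runs (start, hop_count) VALUES (?, ?)",
        (start_article, len(hops)),
    )
    conn.commit()
    conn.close()


save_run("wikicrawl.db", "Banana", ["Fruit", "Botany", "Philosophy"])
```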
Installation
Typical Python project: either activate a virtualenv or install the third-party dependencies into the system-wide Python environment.
```
python ./setup.py install
```

or manually:

```
pip install app_skellington
pip install configobj
pip install colorlog
pip install selenium==3.141.0
```
Usage
```
python ./road2philosophy.py -h
python ./road2philosophy.py open_browser
python ./road2philosophy.py play_single
python ./road2philosophy.py play_multiple
```