Trying xdotool and pyautogui

I tried xdotool and it’s pretty cool. I was automating setting up my Slack status. It finds the Slack window, focuses it, clicks here and there and — tah-dah! — my status is updated. It totally depends on things being in fixed places, but it works.
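
For a sense of what such a script looks like, here is a minimal sketch in Python that drives xdotool through subprocess. It is not the actual script: the function names, the window title pattern and the click coordinates are made up for illustration, and the run parameter is only there so the steps can be tried with a stand-in instead of a real display.

```python
import subprocess


def xdotool(*args, run=subprocess.run):
    """Run one xdotool command and return its stdout as text."""
    # check=True makes a failing xdotool command raise CalledProcessError.
    result = run(["xdotool", *map(str, args)],
                 capture_output=True, text=True, check=True)
    return result.stdout.strip()


def click_in_window(name, x, y, run=subprocess.run):
    """Find a window by title, focus it, and click at (x, y) inside it."""
    # `search --name` prints one window id per line; take the first match.
    ids = xdotool("search", "--name", name, run=run).split()
    if not ids:
        raise RuntimeError(f"no window matching {name!r}")
    window = ids[0]
    # --sync waits until the window is actually active before returning.
    xdotool("windowactivate", "--sync", window, run=run)
    # With --window, the coordinates are relative to that window.
    xdotool("mousemove", "--window", window, x, y, run=run)
    xdotool("click", 1, run=run)
    return window
```

Calling click_in_window("Slack", 40, 60) would focus the first window whose title matches "Slack" and left-click 40 pixels right and 60 pixels down from its top-left corner. The numbers are hypothetical, which is exactly the "fixed places" weakness: if the layout shifts, the click lands somewhere else.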

Then I tried pyautogui, planning to port this script and see how it compares with xdotool. I was surprised right from the start to discover that pyautogui doesn’t have any feature to find or operate on windows, at least on Linux. So I’ve decided to stick with xdotool and dive deeper. I might try pyautogui some other time.

One thing I liked about pyautogui is its “failsafe”: if you slam the mouse and move the cursor to any corner of the screen, pyautogui raises an exception and stops running. This is a great feature. One of my xdotool automations didn’t quite work (the popup appeared too late), and suddenly xdotool was blindly doing stuff in the wrong places and I couldn’t stop it. No damage was done, but it was scary.

I totally needed something like pyautogui’s failsafe, but that required a higher-level language, so I went with Ruby. My failsafe is the top or bottom edge of the screen: before executing any xdotool command, the script checks the mouse location and throws an exception if the cursor is on either edge. It works pretty well, and I fell back on it when I made a mistake and the automation misbehaved.
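
The edge check can be sketched roughly like this. This is not the Ruby script itself, just a minimal Python sketch of the same idea. The names FailSafeError, guarded, mouse_y and screen_height are invented for the example, and it assumes a single screen whose height is what xdotool getdisplaygeometry reports.

```python
import subprocess


class FailSafeError(Exception):
    """Raised when the cursor is parked on the top or bottom screen edge."""


def _xdotool(*args):
    """Run one xdotool command and return its stdout as text."""
    result = subprocess.run(["xdotool", *map(str, args)],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


def mouse_y(run=_xdotool):
    # `getmouselocation --shell` prints lines such as X=10, Y=20, SCREEN=0.
    output = run("getmouselocation", "--shell")
    pairs = dict(line.split("=", 1) for line in output.splitlines())
    return int(pairs["Y"])


def screen_height(run=_xdotool):
    # `getdisplaygeometry` prints "<width> <height>".
    return int(run("getdisplaygeometry").split()[1])


def guarded(*args, run=_xdotool):
    """Run an xdotool command unless the cursor is on the top or bottom edge."""
    y = mouse_y(run)
    if y <= 0 or y >= screen_height(run) - 1:
        raise FailSafeError(f"cursor at y={y}; aborting before {args!r}")
    return run(*args)
```

Every step of an automation then goes through guarded("click", 1) and friends instead of calling xdotool directly, so shoving the mouse to the top or bottom of the screen stops the script before its next command. The check only runs between commands, so a long sleep in the script delays the stop.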

Lesson learnt: automating the GUI is powerful, but also dangerous. Unlike a shell script, the GUI is an unpredictable environment and things can quickly derail. I like the way pyautogui’s documentation describes this:

Like the enchanted brooms from the Sorcerer’s Apprentice programmed to keep filling (and then overfilling) the bath with water, a bug in your program could make it go out of control. It’s hard to use the mouse to close a program if the mouse cursor is moving around on its own.

It’s a powerful but double-edged sword. And lots of fun too! Automating the GUI is another dimension and a very useful skill to have.

GUI Automation on Linux

I’m curious to learn how to automate GUI stuff on Linux, that is, moving the mouse, clicking buttons, pressing keys, etc.

I’m planning to learn two tools: xdotool and pyautogui. I like that you can give pyautogui an image and it will find it on the screen and click on it. Pretty powerful stuff.
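
As a rough idea of the image feature, here is a small Python sketch built on pyautogui’s locateCenterOnScreen and click. The helper name click_image and the file name in the usage note are placeholders, and the gui parameter is only there so the function can be tried with a stand-in object instead of a real screen.

```python
def click_image(image_path, gui=None):
    """Click the center of the first on-screen match of image_path.

    Returns True if the image was found and clicked, False otherwise.
    """
    if gui is None:
        import pyautogui as gui  # third-party: pip install pyautogui

    try:
        center = gui.locateCenterOnScreen(image_path)
    except gui.ImageNotFoundException:
        # Newer pyautogui versions raise instead of returning None.
        center = None

    if center is None:
        return False
    gui.click(center.x, center.y)
    return True
```

Something like click_image("save_button.png") would take a screenshot, look for that picture in it and click the middle of the match. The image has to match the screen closely, so a different theme or scaling factor can make the search fail.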