Automated Web Surfing Software: Automethod Web Suite

I was assigned a very simple mission some time ago:

"We would like to be alarmed if customers are not able to proceed to the order checkout page."

Yes, we could periodically run a script that uses curl to check whether the HTTP requests succeed, and have it send email notifications to the developers.
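For the record, the boring version could look roughly like this. It is a minimal sketch in Python rather than a curl script, and everything named here is my own assumption: the function names, the injectable `opener`, and the `notify` callable that stands in for the "send an email" step.

```python
import urllib.error
import urllib.request


def checkout_is_reachable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if the page answers with HTTP 200, False otherwise."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP error codes, DNS failures, timeouts and refused connections.
        return False


def run_check(url, notify, opener=urllib.request.urlopen):
    """Check the page once; call notify(message) if it is not reachable."""
    if checkout_is_reachable(url, opener=opener):
        return True
    # notify is a hypothetical hook, e.g. a function that sends an email.
    notify(f"Checkout page check failed: {url}")
    return False
```

Scheduling it (cron or similar) and wiring `notify` to a real mail sender are left out on purpose.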

But hey, why don't we do some overkillin' after that and write an automated web agent just for fun? I want a real user surfing the web by clicking, scrolling and such. Controlling HTTP requests is lame anyways.

Maybe something like this:

[Image: selenium automated test web driver automethod]

As I always do, I'm going to give you a reason to continue reading: That guy can play any browser-based game on its own!

Before moving on to the "blah blah section", check this video out (http://youtu.be/po9_ERgTics). It demonstrates how to use the program for a very simple conditional routine:
- Open some page,
- If it is Google and there is a search field on the page, search for "hello" and select the first link,
- Otherwise go to another page,
- Do this in a loop, with a 5-second delay.
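Written as code instead of clicked together in the GUI, the routine above might look something like this. It is only a sketch: `page` is a hypothetical object that exposes `open`, `is_present`, `type` and `click` plus a `url` attribute, the locator names are made up, and "it is Google" is approximated by looking for "google" in the URL.

```python
import time


def run_routine(page, start_url, fallback_url, rounds, delay=5, sleep=time.sleep):
    """Repeat the conditional routine `rounds` times, pausing between rounds."""
    for _ in range(rounds):
        page.open(start_url)
        # Assumption: "it is Google" means the word appears in the current URL.
        if "google" in page.url and page.is_present("search_field"):
            page.type("search_field", "hello")
            page.click("first_result")
        else:
            page.open(fallback_url)
        sleep(delay)
```

Passing `sleep` in as a parameter just makes it possible to try the routine without actually waiting 5 seconds per round.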

(Unfortunately the video is in Turkish for now. An English demo and tutorial are going to be uploaded with the final release.)

I used to be a fan of browser-based games like you, then I didn't take an arrow in the knee and I'm still a huge fan. So why don't I have something which plays Travian (or any other game) by itself? When I was playing OGame like hell, I would have been very happy to get an email like this: "Commander! <someguy> was attacking our fleet in <someplanet>! But don't worry, while you were sleeping in your comfortable bed with drool on your jaw, we managed to put all our resources on carriers and escaped just before the enemy fleet arrived. They returned safely after he was gone. We also waited until the last second before escaping, just to screw with him."

Sending a message of this kind is just a funny glimpse of what this program can do. We can define variables, functions and routines in a GUI without writing any code. It allows the user to inject JavaScript into a page, by selecting pre-defined functions from drop-down lists, or by writing JavaScript directly if desired.

What else can we do over the web with such a program? I'm not going to give examples here, use your imagination ;)

So what do we need for a program like that?

Let's start with this: I want to be informed when someone attacks me in OGame. How do I know I'm being attacked? A red triangular sign appears at the top of the page (very creative). So we simply need to identify the red sign and take the corresponding actions.

We need a "cause and effect" system

- If something occurs on the page, do something.

We need to define conditions and actions

conditions:
- something is present on the page.
- something is not present on the page.
- page URL is something
- ... boolean result of any JavaScript or jQuery which is executed on the page

actions:
- send email
- milk the cow
- beat the trash
- tickle the hair drier
- ... or any JavaScript/jQuery to be run on the page if a condition is satisfied or not satisfied
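To make the "cause and effect" idea concrete, here is a small sketch of conditions and actions as plain Python callables. This is not how the program is implemented. The page is faked as a dictionary with a `url` and a set of `elements`, and all names (`Rule`, `apply_rules`, `is_present` and friends) are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical page snapshot: {"url": str, "elements": set of element names}
Page = Dict[str, object]


@dataclass
class Rule:
    condition: Callable[[Page], bool]  # "if something occurs on the page"
    action: Callable[[Page], None]     # "do something"


def is_present(name: str) -> Callable[[Page], bool]:
    return lambda page: name in page["elements"]


def is_absent(name: str) -> Callable[[Page], bool]:
    return lambda page: name not in page["elements"]


def url_is(url: str) -> Callable[[Page], bool]:
    return lambda page: page["url"] == url


def apply_rules(rules: List[Rule], page: Page) -> int:
    """Run the action of every rule whose condition holds; return how many fired."""
    fired = 0
    for rule in rules:
        if rule.condition(page):
            rule.action(page)
            fired += 1
    return fired
```

A rule like `Rule(is_present("under_attack"), send_email)` then reads almost like the sentence it came from, with `send_email` being whatever function you plug in.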

We should be able to give meanings to html elements

- Red triangular sign => <img id='under_attack' src='you_are_screwed.jpg' ...>
- use Firebug or something similar to manually get the ids and classes of the elements
- use XPath to locate them easily, without being obligated to parse the DOM with your bare hands.
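As a tiny illustration of locating the red sign by XPath, here is a sketch using Python's standard `xml.etree.ElementTree`. Keep in mind that it only supports a subset of XPath and only parses well-formed markup, so it is a toy compared to a real browser. The page snippet and the `find_by_xpath` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# A made-up, well-formed page fragment containing the alarm sign.
PAGE = """
<html>
  <body>
    <div id="header">
      <img id="under_attack" src="you_are_screwed.jpg" />
    </div>
  </body>
</html>
"""


def find_by_xpath(markup, xpath):
    """Return the first element matching the XPath, or None if there is none."""
    root = ET.fromstring(markup.strip())
    return root.find(xpath)


# ElementTree paths are relative to the root element, hence the leading dot.
alarm = find_by_xpath(PAGE, ".//img[@id='under_attack']")
under_attack = alarm is not None
```

The point is that the condition "the red triangle is on the page" boils down to "this XPath matches something".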

OK, now I want my agent to click somewhere if something is present on the page. Clicking can be done in many ways; we can always use curl for GET/POST requests and responses. The thing is, the web is MUCH MOAR than that. Especially while playing browser games, we are in a wild jungle of JavaScript, triggering events and making AJAX calls like there is no tomorrow.

I can feel it now:
- assign the XPath of the red triangular sign, which goes something like //img[@id="under_attack"], to a variable
- do the same for the buttons to press, text fields to type into, and dropdown menus to select from
- define the condition: the red triangle is on the page
- define the action: make a series of clicks and selections in several places (each identified by its XPath, stored as a variable)
- send a notification with a pre-defined message template filled with variables.

- do all these, or some of them conditionally, on a routine.
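Put together, the steps above could be sketched like this. Everything here is illustrative: the XPaths other than the alarm sign are invented, `click` and `notify` are hypothetical callables you would back with a real browser and a real mailer, and the message template is a shortened version of the email above.

```python
from string import Template

# Variables: human-readable names mapped to XPaths.
variables = {
    "alarm_sign": '//img[@id="under_attack"]',
    "fleet_menu": '//a[@id="fleet"]',        # hypothetical element
    "send_button": '//input[@id="send"]',    # hypothetical element
}

# Pre-defined message template, filled with variables at send time.
MESSAGE = Template("Commander! $attacker was attacking our fleet in $planet!")


def handle_attack(present_xpaths, click, notify, attacker, planet):
    """If the alarm sign is on the page, run the escape clicks and notify."""
    # Condition: the red triangle is on the page.
    if variables["alarm_sign"] not in present_xpaths:
        return False
    # Action: a series of clicks, each located by its XPath variable.
    for name in ("fleet_menu", "send_button"):
        click(variables[name])
    notify(MESSAGE.substitute(attacker=attacker, planet=planet))
    return True
```

Calling `handle_attack` inside a loop with a delay turns it into the routine from the last bullet.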

If I use Selenium's browser primitives, I can focus on my main objectives more effectively. They already handle event triggering on a variety of browsers (note that they all use the XPaths of the HTML elements to locate them). Struggling with browser compatibility is not going to help me right now. I'm going to use the Selenium PHP exporter to get the PHP binding methods for the following:

click, type, select (from a dropdown), find (check if an element is present on the page)...

I'm going to wrap these primitives with additional functionality and expose them to the user as pre-defined functions for building routines and scenarios, which the user can then reuse to build even more complex functions.
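The layering idea can be sketched as follows, in Python rather than the PHP bindings mentioned above. The `Primitives` class only records calls instead of driving a browser, and the wrapper names (`click_if_present`, `fill_and_submit`) are invented for the example; they are not the program's actual functions.

```python
class Primitives:
    """Stand-in for the browser primitives; records calls instead of acting."""

    def __init__(self, present=()):
        self.present = set(present)  # XPaths pretending to exist on the page
        self.calls = []

    def click(self, xpath):
        self.calls.append(("click", xpath))

    def type(self, xpath, text):
        self.calls.append(("type", xpath, text))

    def select(self, xpath, option):
        self.calls.append(("select", xpath, option))

    def find(self, xpath):
        return xpath in self.present


def click_if_present(browser, xpath):
    """First layer: a primitive wrapped with a presence check."""
    if browser.find(xpath):
        browser.click(xpath)
        return True
    return False


def fill_and_submit(browser, field_xpath, text, button_xpath):
    """Second layer: a more complex function built from the wrapped ones."""
    browser.type(field_xpath, text)
    return click_if_present(browser, button_xpath)
```

Each layer only talks to the one below it, so swapping the recording stand-in for real browser bindings would not change the routines built on top.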

The program was completed some time ago, but I haven't had time to release it in a convenient way. A demo video is available in Turkish, btw.

Canerus Stdout

Saturday, July 14, 2012