Monitor elements on a webpage with NodeJS

For one of my projects I needed a tool that automatically monitors specific values (number of comments, downloads, etc.) on web pages. After seeing this article on Hacker News, I thought it would be interesting to write the tool myself with NodeJS.

Features

  • Monitor the value of a specific element based on a CSS Selector
  • Alert for changes on said element
  • Use a default URL/selector or pass them as command line arguments

Prerequisites

Setup

jsdom

To fetch the page’s HTML and select the element with a CSS Selector we are going to use jsdom. jsdom can parse HTML from a string, a file or, in our case, a URL, and then expose a window variable similar to the one we have in the browser.

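The code below uses jsdom.env(), which require("jsdom") only exposes in jsdom versions before 10, so let’s install a 9.x release by running:

$ npm install jsdom@9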

CSS Selector

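Finding a unique CSS Selector for a page’s element is quite easy with our browser’s developer tools. In Chrome, right-click the element and choose Inspect, then right-click the highlighted node in the Elements panel and pick Copy > Copy selector.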

Finding a unique CSS Selector for a page's element with chrome

Code

Now that we have everything we need, create index.js and let’s jump into the code.

jsdom & variables

First we will include jsdom and declare our basic variables:

index.js
const jsdom = require("jsdom");

const delay = 10; //loop delay in seconds
let url = 'https://news.ycombinator.com/news'; //default url, change this to whatever you want
let ccsSelector = '#score_13635230'; //default selector, change this to whatever you want
let previousValue; //we need to store the value in order to compare it with the next one

checkValue function

Now let’s create a function that gets the HTML and finds the value we want.
We will pass two arguments to jsdom: the HTML’s source (our url) and a callback function. In the callback we have access to the window object, so it’s easy to extract the value with native DOM methods.

index.js
function checkValue(){
  jsdom.env(
    url,
    function (err, window) {
      //get the element's value
      const newValue = window.document.querySelector(ccsSelector).textContent;
    }
  );
}
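
One thing the callback above does not cover is a selector that matches nothing: querySelector() then returns null, and reading .textContent from it throws a TypeError. A minimal sketch of a guard follows. The readText helper and the fakeDocument stand-in are hypothetical names used only for illustration, and trimming the text is an extra assumption that the original script does not make.

```javascript
// Hypothetical helper, not part of the original script.
// `document` is anything exposing querySelector(), such as window.document.
function readText(document, selector) {
  const element = document.querySelector(selector);
  // querySelector() returns null when no element matches the selector
  return element === null ? null : element.textContent.trim();
}

// Stand-in for a real document so the sketch runs without fetching a page.
const fakeDocument = {
  querySelector: (selector) =>
    selector === '#score' ? { textContent: ' 326 points ' } : null,
};

console.log(readText(fakeDocument, '#score'));   // 326 points
console.log(readText(fakeDocument, '#missing')); // null
```

Inside checkValue() the same helper could be called with window.document, and a null result handled however suits the project, for example by printing a warning.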

Handling the new value

Now that we have the value, we have to decide what to do with it. There are 3 possibilities:

  • It’s the first time we checked the value, so let’s print it
  • The value changed since the last time we checked it, so let’s print both the old and the new value
  • The value didn’t change, so we do nothing

Let’s write this as code and add it to our callback:

index.js
function checkValue(){
  jsdom.env(
    url,
    function (err, window) {
      //get the element's value
      const newValue = window.document.querySelector(ccsSelector).textContent;

      if( typeof previousValue==='undefined' ){ //check if this is the first call
        console.log(`The value is ${newValue}`);
      } else if( previousValue!==newValue ){ //check if the value changed from the last call
        console.log(`Value changed from ${previousValue} to ${newValue}`);
      }

      previousValue = newValue; //update the value
    }
  );
}
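
The three cases can also be tried out in isolation, without fetching anything. The sketch below pulls the decision into a pure function. describeChange is a hypothetical name, not something from the original script, and returning null for "print nothing" is just one possible convention.

```javascript
// Hypothetical helper: returns the message to print, or null when nothing should be printed.
function describeChange(previousValue, newValue) {
  if (typeof previousValue === 'undefined') {
    // first check: there is nothing to compare against yet
    return `The value is ${newValue}`;
  }
  if (previousValue !== newValue) {
    // the value changed since the last check
    return `Value changed from ${previousValue} to ${newValue}`;
  }
  return null; // unchanged
}

console.log(describeChange(undefined, '326 points'));    // The value is 326 points
console.log(describeChange('326 points', '327 points')); // Value changed from 326 points to 327 points
console.log(describeChange('327 points', '327 points')); // null
```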

Exiting on error

If jsdom returns an error, let’s print it and stop the program:

index.js - checkValue()
...
    function (err, window) {
      if( err ){
        console.log(err);
        process.exit();
      }

      //get the element's value
      const newValue = window.document.querySelector(ccsSelector).textContent;
...

Repeating the check

The checkValue function is ready, so let’s make it run repeatedly with setInterval().
setInterval() waits for the specified delay before the first call, so we will invoke the function manually the first time.

index.js
...
checkValue();
setInterval(checkValue, delay*1000); //start invoking the function, *1000 because the delay should be in milliseconds
...
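
To see the "call once, then repeat" pattern on its own, here is a small sketch. startPolling is a hypothetical wrapper, not part of the original script. It returns the timer so the loop can be stopped with clearInterval(), which the demo does right away so that it exits instead of running forever.

```javascript
// Hypothetical helper: run `task` immediately, then again every `seconds` seconds.
function startPolling(task, seconds) {
  task(); // setInterval() alone would wait one full delay before the first run
  return setInterval(task, seconds * 1000); // the delay is expected in milliseconds
}

let runs = 0;
const timer = startPolling(() => { runs += 1; }, 10);

console.log(runs);    // 1, the first run happened immediately
clearInterval(timer); // stop polling so this demo can exit
```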

Our code so far

index.js
const jsdom = require("jsdom");

const delay = 10;
let url = 'https://news.ycombinator.com/news';
let ccsSelector = '#score_13635230';
let previousValue;

checkValue();
setInterval(checkValue, delay*1000);

function checkValue(){
  jsdom.env(
    url,
    function (err, window) {
      if( err ){
        console.log(err);
        process.exit();
      }

      const newValue = window.document.querySelector(ccsSelector).textContent;

      if( typeof previousValue==='undefined' ){
        console.log(`The value is ${newValue}`);
      } else if( previousValue!==newValue ){
        console.log(`Value changed from ${previousValue} to ${newValue}`);
      }

      previousValue = newValue;
    }
  );
}

Let’s run it with:

$ node index

After a while you should see something like

The value is 326 points
Value changed from 326 points to 327 points
Value changed from 327 points to 328 points

Checking if the url contains the ‘http://’ prefix

For jsdom to treat our string as a url and fetch the HTML, we have to make sure that it starts with the ‘http://’ or ‘https://’ prefix.

index.js
...
let previousValue;

if( url.indexOf('http')!==0 ){
  url = 'http://'+url;
}

checkValue();
setInterval(checkValue, delay*1000);
...
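
Note that the check above only looks at the first four characters, so a host name that happens to begin with "http" (httpbin.org, for example) is left without a prefix. A slightly stricter sketch is shown below. ensureProtocol is a hypothetical helper name, and matching the whole scheme with a regular expression is a swapped-in technique rather than what the original script does.

```javascript
// Hypothetical helper: add 'http://' unless the string already starts with http:// or https://
function ensureProtocol(url) {
  return /^https?:\/\//i.test(url) ? url : 'http://' + url;
}

console.log(ensureProtocol('example.com'));         // http://example.com
console.log(ensureProtocol('https://example.com')); // https://example.com
console.log(ensureProtocol('httpbin.org/get'));     // http://httpbin.org/get
```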

Dynamic url/selector from command line arguments

In Node we can find the command line arguments in the process.argv array. The first two entries are the path to the node executable and the path to the script being run (from node index), so our values will be the last two items ([2] and [3]).

index.js
...
let previousValue;

if( process.argv.length==4 ){ //overwrite only when the arguments are available
  url = process.argv[2];
  ccsSelector = process.argv[3];
}

if( url.indexOf('http')!==0 ){
  url = 'http://'+url;
...

Now we can override the default url/selector when starting the program

$ node index "https://example.com" "#someSelector"
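
To make the layout of process.argv concrete, the sketch below runs the same length check against a simulated array instead of the real one. readArguments is a hypothetical helper, and the two paths in the simulated array are made-up placeholders; the real values depend on where node and the script live on your machine.

```javascript
// Hypothetical helper mirroring the length check: both values are overridden together or not at all.
function readArguments(argv, defaults) {
  if (argv.length !== 4) {
    return defaults; // arguments missing (or too many): keep the defaults
  }
  return { url: argv[2], selector: argv[3] };
}

const defaults = {
  url: 'https://news.ycombinator.com/news',
  selector: '#score_13635230',
};

// Simulated argv for: node index "https://example.com" "#someSelector"
// (the first two paths are illustrative placeholders)
const simulated = ['/usr/bin/node', '/home/me/index', 'https://example.com', '#someSelector'];

console.log(readArguments(simulated, defaults).url);      // https://example.com
console.log(readArguments(simulated, defaults).selector); // #someSelector
console.log(readArguments(simulated.slice(0, 2), defaults) === defaults); // true
```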

Final code

index.js
const jsdom = require("jsdom");

const delay = 10;
let url = 'https://news.ycombinator.com/news';
let ccsSelector = '#score_13635230';
let previousValue;

if( process.argv.length==4 ){
  url = process.argv[2];
  ccsSelector = process.argv[3];
}

if( url.indexOf('http')!==0 ){
  url = 'http://'+url;
}

checkValue();
setInterval(checkValue, delay*1000);

function checkValue(){
  jsdom.env(
    url,
    function (err, window) {
      if( err ){
        console.log(err);
        process.exit();
      }

      const newValue = window.document.querySelector(ccsSelector).textContent;

      if( typeof previousValue==='undefined' ){
        console.log(`The value is ${newValue}`);
      } else if( previousValue!==newValue ){
        console.log(`Value changed from ${previousValue} to ${newValue}`);
      }

      previousValue = newValue;
    }
  );
}

Closing

You can find the final code with comments on the GitHub repo. Future improvements will probably include a frontend that gets updated on changes via sockets (already under development) and simultaneous checking of many webpages.