At the moment the first AJAX call is triggered as soon as the page
loads. We then look at its response time to work out how long to wait
until making the next call.
This isn’t great because:
- stuff is unlikely to have changed straight away, so it’s a waste of a
  call
- updating the DOM seems to delay clicks on links, and the delay is more
  pronounced (or at least more noticeable) when it happens as soon as the
  page loads, making the UI feel less snappy
I chose a value of 2 seconds as a rough proxy for the minimum time we’d
expect to see a notification go from created to delivered. Median
time-to-delivered was 2.9 seconds when we analysed it for
https://github.com/alphagov/notifications-admin/pull/2974#discussion_r286101286
By default our AJAX calls fired every 2 seconds. Then we made it every
5 seconds because someone reckoned 2 seconds was putting too much load
on the system. Then every 10 seconds while we were having an incident.
Then every 20 seconds for the heaviest pages, but back to 5 seconds or
2 seconds for the rest of the pages.
This is not a good situation because:
- it slows all services down equally, no matter how much traffic they
have, or which features they have switched on
- it slows everything down by the same amount, no matter how much load
the platform is under
- the values are set based on our worst performance, until we manually
remember to switch them back
- we spend time during incidents deploying changes to slow down the
dashboard refresh time because it’s a nothing-to-lose change that
might relieve some symptoms, when we could be spending time digging
into the underlying cause
This pull request makes the JavaScript smarter about how long it waits
until it makes another AJAX call. It bases the delay on how long the
server takes to respond (as a proxy for how much load the server is
under).
It’s based on the square root of the response time, so it is more
sensitive to slowdowns early on, and less sensitive to slowdowns later
on. This gives a more pronounced difference in delay between an AJAX
call that is fast (for example the page for a single notification) and
one that is slow (for example the dashboard for a service with lots of
traffic).
*Some examples of what this would mean for various pages*
Page | Response time | Wait until next AJAX call
---|---|---
Check a reply to address | 130ms | 1,850ms
Brand new service dashboard | 229ms | 2,783ms
HM Passport Office dashboard | 634ms | 5,294ms
NHS Coronavirus Service dashboard | 779ms | 5,977ms
_Example of the kind of slowness we’ve seen during an incident_ | 6,000ms | 18,364ms
GOV.UK email dashboard | `HTTP 504` | 😬
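For illustration, a delay function consistent with the figures above
might look like this (the constants are inferred from the table, not
necessarily the exact ones in the code):
```javascript
// Delay before the next AJAX call, derived from the response time.
// More sensitive at the fast end, less sensitive at the slow end.
function calculateBackoff (responseTimeInMilliseconds) {
  return Math.floor(
    (250 * Math.sqrt(responseTimeInMilliseconds)) - 1000
  );
}

calculateBackoff(130);   // 1850 – a fast page
calculateBackoff(6000);  // 18364 – incident-level slowness
```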
This is mainly because tests don't infer that
global variables are just properties of the `window`
object, as browsers do, but it also makes this
more explicit.
AJAX requests stop on success or failure, as the waiting page
no longer has to refresh.
Also, on failure, a form that allows the user to try again
is shown.
Since it moved to ES Modules in version 2.3.1,
diff-dom stopped including the `diffDOM.js` file
in its NPM package.
We don't do any kind of bundling in our build yet,
just concatenation of our scripts and some
minification of the results, so we can't take
advantage of this yet.
The `diffDOM.js` file is still available in the
GitHub release, so this moves to referencing that
in the `package.json` instead, until we start
using a bundler.
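For example, the dependency might change from a registry version to a
GitHub reference, something like this (the tag name is a placeholder,
not the actual release used):
```
{
  "dependencies": {
    "diff-dom": "git+https://github.com/fiduswriter/diffDOM.git#<release-tag>"
  }
}
```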
I opened an issue to check this is what the author
intended:
https://github.com/fiduswriter/diffDOM/issues/84
The latest version also adds Rollup as a peer
dependency.
See parent commit for the reason we’re doing this.
Currently our AJAX requests only work as `GET` requests. So this commit
does a bit of work to make them work as `POST` requests too. This is
optional behaviour, and will only happen when the element in the page
that should be updated with AJAX has the `data-form` attribute set. It
will take the form that has the corresponding `id`, serialise it, and
use that data as the body of the `POST` request. If no form is specified
it will not do the serialisation, and will submit a `GET` request as
before.
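A sketch of that behaviour, assuming jQuery; `data-form` is from the
description above, the other names (and the response shape) are
illustrative:
```javascript
// Poll the resource, optionally POSTing a serialised form
function poll ($component, resource) {
  var formId = $component.data('form');
  var render = function (response) {
    $component.html(response.html);  // simplified: the real module diffs the DOM
  };

  if (formId) {
    // serialise the form with the matching id, use it as the POST body
    $.post(resource, $('#' + formId).serialize()).done(render);
  } else {
    // no form specified: plain GET, as before
    $.get(resource).done(render);
  }
}
```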
The pages with AJAX on were feeling quite sluggish, and it felt like
they were making the whole browser slow down.
Looking at the performance tools in Chrome, I could see the number of
DOM nodes going up and up, which is weird because the number of things
on the page wasn’t changing. This was causing the page to consume more
and more memory in order to store all those nodes.
This is kinda beyond my understanding, but I tried a few things and it
looks like the browser was having a hard time garbage collecting the
temporary bits of DOM used to update the page.
By assigning these bits of DOM to variables before using them, it seems
that the browser has an easier time cleaning them up.
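A sketch of the kind of change, with illustrative names:
```javascript
// `dd` is a diffDOM instance; `component` the element being updated;
// `newHTML` the markup that came back from the server
function applyUpdate (dd, component, newHTML) {
  // before, the temporary DOM for the new content was created inline:
  //   dd.apply(component, dd.diff(component, $(newHTML).get(0)));

  // after: it’s assigned to a variable first, which seemed to make it
  // easier for the browser to garbage-collect once the patch was applied
  var newNodes = $(newHTML).get(0);
  dd.apply(component, dd.diff(component, newNodes));
}
```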
With the change in implementation in the previous commit, any error
(eg the server responding with a `500`) would stop the page from ever
being updated again.
This is still better than the implementation before that, whereby the
browser would re-request as fast as it could until it got a successful
response.
This commit handles errors by clearing the render queue if the server
returns an error. So:
- Any updates that would have been performed based on this request are
dropped
- Subsequent updates will be attempted as if it was the first load of
the page, ie after a delay of `x` seconds
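A sketch of the error handling, reusing the queue-per-endpoint
structure described in the next commit message (`queues` maps endpoint
URLs to arrays of pending render functions; jQuery assumed):
```javascript
function request (resource, queues) {
  $.get(resource)
    .done(function (response) {
      // success: flush the queue by running every pending render
      queues[resource].forEach(function (render) { render(response); });
      queues[resource] = [];
    })
    .fail(function () {
      // error: drop the pending renders; the next attempt behaves like
      // a first page load, ie it fires after a delay of `x` seconds
      queues[resource] = [];
    });
}
```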
The `updateContent` module updates a section of the page based on an
AJAX request.
The current implementation polls every `x` seconds. If the server
takes a long time to respond, this results in the browser queuing up
requests. If a particular endpoint is slow, one client can end up
making enough requests to slow down the whole server, to the
point where it gets removed from the load balancer, eventually bringing
the whole site down.
This commit rewrites the module so that it queues up the render
operations, not the requests.
There is one queue per endpoint, so for
`http://example.com/endpoint.json`:
1. Queue is empty
```javascript
{
'http://example.com/endpoint.json': []
}
```
2. Initial re-render is put on the queue…
```javascript
{
'http://example.com/endpoint.json': [
function render(){…}
]
}
```
…AJAX request fires
```
GET http://example.com/endpoint.json
```
3. Every `x` seconds, another render operation is put on the queue
```javascript
{
'http://example.com/endpoint.json': [
function render(){…},
function render(){…},
function render(){…}
]
}
```
4. AJAX request returns; the queue is flushed by executing each queued
render function in sequence
```javascript
render(response); render(response); render(response);
```
```javascript
{
'http://example.com/endpoint.json': []
}
```
5. Repeat
This means the AJAX requests will never fire more than once every `x`
seconds, where `x` defaults to `1.5`.
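Condensed into code, the mechanism looks something like this (a sketch
with illustrative names, assuming jQuery):
```javascript
var queues = {};

function poll (resource, interval) {
  // every tick, add a render operation to this endpoint’s queue
  (queues[resource] = queues[resource] || []).push(function render (response) {
    // …diff and patch this section of the page with the response…
  });

  // only the tick that finds a queue of length 1 fires a request; while
  // a slow request is in flight, later ticks just queue their renders
  if (queues[resource].length === 1) {
    $.get(resource).done(function (response) {
      queues[resource].forEach(function (render) { render(response); });
      queues[resource] = [];
    });
  }

  window.setTimeout(function () { poll(resource, interval); }, interval * 1000);
}
```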
Currently, when we update a section of the page with AJAX we replace the
entire HTML of the section with the new HTML. This causes problems:
- if you’re trying to interact with that section of the page, eg by
  inspecting it, or clicking or hovering over an element
- (probably) for screen readers trying to navigate a page which is
  changing more than is necessary
This commit replaces the call to `.html()` with a pretty clever library
called diffDOM[1]. DiffDOM works by taking a diff of the old element and
the new element, then doing a patch update, ie only modifying the parts
that have changed.
This is similar in concept to React’s virtual DOM, while still allowing
us to render all markup from one set of templates on the server-side.
1. https://github.com/fiduswriter/diffDOM
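A sketch of the replacement, using the library’s diff/apply API
(element names are illustrative):
```javascript
var dd = new diffDOM();  // the global constructor from diffDOM.js

function render ($component, newHTML) {
  var currentNode = $component.get(0);
  var newNode = $(newHTML).get(0);

  // compute the minimal set of changes, then apply only those,
  // instead of replacing the whole section with `.html()`
  dd.apply(currentNode, dd.diff(currentNode, newNode));
}
```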
Some pages with AJAX should update quickly, because the data is likely
to be changing quickly, and be finished changing sooner. Other pages we
want to have tick over a bit slower.
This commit adds an optional `interval` parameter to the `updateContent`
module, which sets how often the page should ping the server for an
update.
It then sets the interval for the dashboard page to be 10 seconds,
rather than the default 1.5 seconds.
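For example, a page might opt in to the slower tick something like this
(the option plumbing and `init` name are illustrative):
```javascript
// dashboard: poll every 10 seconds instead of the default 1.5
updateContent.init({
  interval: 10
});
```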
This is a first go at having the job page update without refreshing.
The approach I’ve taken is to do all the rendering of HTML on the server side,
rather than use a JavaScript templating engine like Mustache. This ensures that
we don’t have to maintain two sets of templates.
So the approach is to split the job page into partials. These partials can then:
- be included in the job page to render the whole page
- be rendered individually and then returned as a blob of HTML inside a JSON
  response (see the example below)
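For illustration, such a response might look something like this (the
key name is invented):
```
{
  "html": "<div class=\"big-number\">…</div>"
}
```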
Then I’ve added a JavaScript module which looks for areas of the page that should
be reloaded. For each area of the page it will poll a URL and re-render that
section of the page when it gets new HTML. It implements some throttling so that
API calls will never happen more frequently than 0.67 times/second.
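Put together, the page-side wiring might look something like this early
version (jQuery assumed, attribute names illustrative; later commits
rework this naive polling):
```javascript
$(function () {
  $('[data-module="update-content"]').each(function () {
    var $area = $(this);

    // poll this area’s URL every 1.5 seconds (0.67 times/second) and
    // swap in the new HTML from the JSON response
    window.setInterval(function () {
      $.get($area.data('resource')).done(function (response) {
        $area.html(response.html);
      });
    }, 1500);
  });
});
```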