Fix out-of-date rows in ft_notification_status

This can happen in the following scenario (primarily for letters):

1. A service has a mixture of "delivered" and "sending" letters,
which the status task aggregates into two rows:

  sending | 123
  delivered | 456

2. Once the 7-day retention period has passed, only the "delivered"
letters are archived [^1].

3. The status task now looks at the history table [^2], which means
it only sees the "delivered" letters.

4. The "sending" letters are eventually "delivered" and archived (before
the 10-day aggregation cutoff).

5. But the status aggregation task doesn't run again for that service
and day, so its rows in ft_notification_status stay out of date.

This commit fixes (5).

[^1]: https://github.com/alphagov/notifications-api/pull/3063
[^2]: f87ebb094d/app/dao/fact_notification_status_dao.py (L51)
Ben Thorner
2022-05-10 11:14:59 +01:00
parent 867e8fbce3
commit ed379a3724
2 changed files with 33 additions and 10 deletions

@@ -13,7 +13,7 @@ from notifications_utils.recipients import (
     validate_and_format_email_address,
 )
 from notifications_utils.timezones import convert_bst_to_utc, convert_utc_to_bst
-from sqlalchemy import and_, asc, desc, func, or_
+from sqlalchemy import and_, asc, desc, func, or_, union
 from sqlalchemy.orm import joinedload
 from sqlalchemy.orm.exc import NoResultFound
 from sqlalchemy.sql import functions
@@ -807,14 +807,26 @@ def get_service_ids_with_notifications_on_date(notification_type, date):
     start_date = get_london_midnight_in_utc(date)
     end_date = get_london_midnight_in_utc(date + timedelta(days=1))
 
+    notification_table_query = db.session.query(
+        Notification.service_id.label('service_id')
+    ).filter(
+        Notification.notification_type == notification_type,
+        # using >= + < is much more efficient than date(created_at)
+        Notification.created_at >= start_date,
+        Notification.created_at < end_date,
+    )
+
+    # Looking at this table is more efficient for historical notifications,
+    # provided the task to populate it has run before they were archived.
+    ft_status_table_query = db.session.query(
+        FactNotificationStatus.service_id.label('service_id')
+    ).filter(
+        FactNotificationStatus.notification_type == notification_type,
+        FactNotificationStatus.bst_date == date,
+    )
+
     return {
-        row.service_id
-        for row in db.session.query(
-            Notification.service_id
-        ).filter(
-            Notification.notification_type == notification_type,
-            # using >= + < is much more efficient than date(created_at)
-            Notification.created_at >= start_date,
-            Notification.created_at < end_date,
-        ).distinct()
+        row.service_id for row in db.session.query(union(
+            notification_table_query, ft_status_table_query
+        ).subquery()).distinct()
     }
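
To illustrate why the union matters, here is a minimal, self-contained sketch that uses SQLite from Python's standard library instead of the real SQLAlchemy models. The simplified table layouts, the `service-a` / `service-b` IDs, the example date, and the helper names `old_lookup` and `new_lookup` are all assumptions made for illustration. Only the idea of combining the notifications table with `ft_notification_status` is taken from the diff above.

```python
import sqlite3

# Hypothetical, heavily simplified stand-ins for the real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notifications (
        service_id TEXT, notification_type TEXT, created_at TEXT
    );
    CREATE TABLE ft_notification_status (
        service_id TEXT, notification_type TEXT, bst_date TEXT
    );
    -- service-a: every letter for the day has been archived, so nothing is
    -- left in notifications, but an earlier task run left an aggregate row.
    INSERT INTO ft_notification_status
        VALUES ('service-a', 'letter', '2022-05-01');
    -- service-b: still has a live letter for the same day.
    INSERT INTO notifications
        VALUES ('service-b', 'letter', '2022-05-01 09:30:00');
""")

# Assumption: plain timestamps; the real code converts London midnight to UTC.
PARAMS = {
    "type": "letter",
    "day": "2022-05-01",
    "start": "2022-05-01 00:00:00",
    "end": "2022-05-02 00:00:00",
}


def old_lookup(conn):
    """Service IDs found by looking at the notifications table only."""
    rows = conn.execute(
        "SELECT DISTINCT service_id FROM notifications "
        "WHERE notification_type = :type "
        "AND created_at >= :start AND created_at < :end",
        PARAMS,
    )
    return {row[0] for row in rows}


def new_lookup(conn):
    """Service IDs found in either table; UNION removes duplicates."""
    rows = conn.execute(
        "SELECT service_id FROM notifications "
        "WHERE notification_type = :type "
        "AND created_at >= :start AND created_at < :end "
        "UNION "
        "SELECT service_id FROM ft_notification_status "
        "WHERE notification_type = :type AND bst_date = :day",
        PARAMS,
    )
    return {row[0] for row in rows}
```

In this sketch `old_lookup(conn)` finds only `service-b`, while `new_lookup(conn)` also finds `service-a`, which is the kind of service whose aggregate rows would otherwise never be revisited.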