Commit Graph

1237 Commits

Author SHA1 Message Date
Klaas van Schelven
72ab0c68ef Log message: no need to mention 'include_never_evict'
because when you reach that point, it's always True
2025-03-19 15:43:05 +01:00
Klaas van Schelven
a2f3ad900b eviction-target not reached handling changes
this error has shown up for one of our users; I can't reproduce yet, but I can
make it better:

* log-don't-crash: not worth failing for this (drops the event, and also
  rolls back the transaction such that nothing is achieved regarding eviction)
* provide more info on-error (various counts)

NB: I've also changed the < into a <=, and combined it with a check on "loop
not done". I _think_ they are functionally equivalent, and that the new version
is simply more clear as well as slightly more efficient.

In my understanding: the old version simply looped one more time before giving
up (because it was < it needed one more iteration, and because there was no
explicit check on 'loop done' that inefficiency was needed in the old formulation).
I say "I think" because I don't have a test specific to the edge-case.
2025-03-19 15:32:39 +01:00
Klaas van Schelven
fc4aae2dea Retention tests: hit even more edge-cases 2025-03-19 14:49:29 +01:00
Klaas van Schelven
1d0c0c65ff Retention tests/clarification: filter_for_work 2025-03-19 14:33:00 +01:00
Klaas van Schelven
d3c6627556 Add a more complicated case to the retention tests
this one tests at least multiple epochs and irrelevances
2025-03-19 14:18:28 +01:00
Klaas van Schelven
1b7865d3b9 Eviction: Tests and rewrite-for-understanding of epoch_bounds_with_irrelevance 2025-03-19 11:56:55 +01:00
Klaas van Schelven
98a2ab9054 Remove nohub/CaptureExceptionMiddleware
these were development-tools in the 'cornless' disposable web-server style;
not using them now, and if I ever need them back I'll dig them out of the git history.

see https://www.bugsink.com/blog/disposable-web-servers/
2025-03-19 10:18:03 +01:00
Klaas van Schelven
8b18939c6f coverage: configure for local development 2025-03-19 10:09:51 +01:00
Klaas van Schelven
38d49f5000 Django Debug Toolbar: don't crash when not installed
It happens with some regularity that people notice the "DEBUG" setting
and try to run with DEBUG=True. Although this is not documented nor recommended
you can't really blame 'm, and it would probably help them debug their issues.

Pre-this-commit that was not possible, because the debug toolbar is usually not
installed (and on e.g. Docker this is very annoying to do).
2025-03-19 08:53:21 +01:00
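A minimal sketch of the idea behind the Debug Toolbar commit above: only register an optional development dependency when it is actually importable. This is not the project's settings code; the helper name `with_optional_app` and its parameters are hypothetical, and only the `debug_toolbar` module name and its middleware path come from Django Debug Toolbar itself.

```python
import importlib.util


def with_optional_app(installed_apps, middleware, module_name="debug_toolbar",
                      middleware_path="debug_toolbar.middleware.DebugToolbarMiddleware"):
    """Return (apps, middleware), adding the optional app only if it can be imported.

    Hypothetical helper for illustration; a settings module could call it when
    DEBUG is True instead of unconditionally listing the toolbar.
    """
    if importlib.util.find_spec(module_name) is None:
        # Not installed: leave both lists untouched instead of crashing at startup.
        return list(installed_apps), list(middleware)
    return list(installed_apps) + [module_name], [middleware_path] + list(middleware)


# With a module that is not installed, nothing changes.
apps, mw = with_optional_app(["myapp"], ["SomeMiddleware"], module_name="no_such_module_xyz_123")
print(apps, mw)
```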
Klaas van Schelven
eb780c0008 Snappea Foreman: don't crash on "non-bulletproof" pid-check 2025-03-19 08:53:21 +01:00
Klaas van Schelven
35c1095248 Merge pull request #63 from bugsink/dependabot/pip/python-packages-c12e06c428
Update sentry-sdk requirement from ==2.22.* to ==2.23.*
2025-03-19 08:52:09 +01:00
dependabot[bot]
9fd3aaf887 Update sentry-sdk requirement in the python-packages group
Updates the requirements on [sentry-sdk](https://github.com/getsentry/sentry-python) to permit the latest version.

Updates `sentry-sdk` to 2.23.1
- [Release notes](https://github.com/getsentry/sentry-python/releases)
- [Changelog](https://github.com/getsentry/sentry-python/blob/master/CHANGELOG.md)
- [Commits](https://github.com/getsentry/sentry-python/compare/2.22.0...2.23.1)

---
updated-dependencies:
- dependency-name: sentry-sdk
  dependency-type: direct:production
  dependency-group: python-packages
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-03-18 16:02:05 +00:00
Klaas van Schelven
caf4cef193 1.4.1 changelog 2025-03-17 16:03:30 +01:00
Klaas van Schelven
a383aee6c0 Dockerfile: pip install "psycopg[binary]"
needed for testing with the postgres DB backend

See #21, #61
2025-03-17 15:56:36 +01:00
Klaas van Schelven
492de6d94f sqlite: per-query timeout configurable 2025-03-17 09:26:15 +01:00
Klaas van Schelven
e02d04d879 Make EMAIL_TIMEOUT configurable on Docker
Fixes #60
2025-03-16 11:30:26 +01:00
Klaas van Schelven
9290d24b7e Fix typo in database-scheme check for Docker
Fix #61
2025-03-16 11:22:11 +01:00
Klaas van Schelven
b7e830a511 Snappea stale tasks: show total_seconds
as per the Python docs:

> It is a somewhat common bug for code to unintentionally use this attribute
> when it is actually intended to get a total_seconds() value instead

https://docs.python.org/3/library/datetime.html#datetime.timedelta.seconds
2025-03-13 15:41:05 +01:00
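To illustrate the pitfall quoted from the Python docs in the commit above: `timedelta.seconds` holds only the seconds component of the duration, while `total_seconds()` returns the whole duration. The variable name `age` is made up for this example.

```python
from datetime import timedelta

# A task that has been stale for one day and thirty seconds.
age = timedelta(days=1, seconds=30)

print(age.seconds)          # 30: the days part is not included
print(age.total_seconds())  # 86430.0: the full duration in seconds
```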
Klaas van Schelven
170222b84c Fix date on changelog 2025-03-13 15:23:44 +01:00
Klaas van Schelven
5a9bab9e20 EagerPaginator
as explained in the comment
2025-03-13 14:49:49 +01:00
Klaas van Schelven
cda7e454c9 init_tags command: avoid unbounded WAL growth 2025-03-13 13:20:35 +01:00
Klaas van Schelven
1d8d6f1ac6 'flatten' migrations for tags
unreleased migrations: preference to flatten those;
happens to also fix mysql tests (for which the datamigration failed)
2025-03-13 09:23:25 +01:00
Klaas van Schelven
4d26252357 GitHub workflow: don't 'fail fast', i.e. run the rest of the matrix when part fails 2025-03-13 09:16:54 +01:00
Klaas van Schelven
651ed1d8c5 .github flake8: harmonize with tox.ini
tox.ini change in 348c2dc80f
2025-03-13 09:07:22 +01:00
Klaas van Schelven
4ca564b98b Show 'stored events' in 'Issue key info' 2025-03-13 09:05:29 +01:00
Klaas van Schelven
0257884a69 Search results count: intcomma 2025-03-13 09:01:21 +01:00
Klaas van Schelven
ba5c291f57 Search performance: use Event.issue when searching
In b031792784 using Event.issue was made conditional (if we already filter
by Tag, the tag encodes that info already, and it was assumed adding the
WHERE elsewhere would confuse the query optimizer).

As per that commit's message, the measurements that led me to that decision
were probably wrong. I now simply think: the more places you narrow your
search, the easier your DB will have it.

Measuring shows this is indeed so (on the order of 20-30%) for all cases
in which this still matters (the present fix is on the now-less-visited path).
2025-03-12 21:36:51 +01:00
Klaas van Schelven
175e103d23 Search optimization: when counting the results takes too long, don't 2025-03-12 21:26:47 +01:00
Klaas van Schelven
c3ed995ecc Performance (of non-search): don't count if you know the answer 2025-03-12 20:44:50 +01:00
Klaas van Schelven
1eea9268a5 Optimization: Search on EventTag without involving Event if possible
When searching by tag, there is no need to join with Event; especially when
just counting results or determining first/last digest_order (for navigation).

(For the above "no need" to be actually true, digest_order was denormalized
into EventTag).

The above is implemented in `search_events_optimized`.

Further improvements:

* the bounds of `digest_order` are fetched only once; for first/last this info
  is reused.

* explicitly pass `event_qs_count` to the templates

* non-event pages used to calculate a "last event" to generate a tab with a
  correct event.id; since we simply have the "last" idiom, better use that.
  this also makes clear the "none" idiom was never needed, we remove it again.

Results:

Locally (60K event DB, 30K events on largest issue) my testbatch now
runs in 25% of time (overall).

* The effect on the AND-ing is in fact very large (13% runtime remaining)
* The event details page is not noticeably improved.
2025-03-12 20:38:07 +01:00
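A rough sketch of the denormalization described in the commit above, using plain sqlite3 rather than the project's Django models. It is not `search_events_optimized`; the table layout, the index, and the helper `search_bounds` are assumptions made for illustration. The point is that once `digest_order` (and the issue) live on the tag row, the count and the first/last bounds can be fetched in one query without joining the event table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, issue_id INTEGER, digest_order INTEGER);
    CREATE TABLE eventtag (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        issue_id INTEGER,      -- denormalized copy of event.issue_id
        digest_order INTEGER,  -- denormalized copy of event.digest_order
        value TEXT
    );
    CREATE INDEX eventtag_search ON eventtag (issue_id, value, digest_order);
""")

# Five events on one issue; odd ones tagged "prod", even ones "staging".
for order in range(1, 6):
    conn.execute("INSERT INTO event VALUES (?, 1, ?)", (order, order))
    value = "prod" if order % 2 else "staging"
    conn.execute(
        "INSERT INTO eventtag (event_id, issue_id, digest_order, value) VALUES (?, 1, ?, ?)",
        (order, order, value),
    )


def search_bounds(conn, issue_id, value):
    """Return (count, first digest_order, last digest_order) from eventtag alone."""
    return conn.execute(
        "SELECT COUNT(*), MIN(digest_order), MAX(digest_order) "
        "FROM eventtag WHERE issue_id = ? AND value = ?",
        (issue_id, value),
    ).fetchone()


print(search_bounds(conn, 1, "prod"))  # (3, 1, 5)
```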
Klaas van Schelven
aa341c7437 Test/debug script for search performance 2025-03-12 14:14:05 +01:00
Klaas van Schelven
cd7f3978cf Improve tag-overview performance
* denormalize IssueTag.key; this allows for key to be used in an index
  (issue, key, count).

* rewrite to grouping-first, per-key-query-second. i.e. reverts part of
  bbfee84c6a. Reasoning: I don't want to rely on "mostly unique" always
  guessing correctly, and we don't dynamically determine that yet. Which
  means that (in the single query version) if you'd have a per-event value for
  some tag, you could end up iterating over as many values as there are events,
  which won't work.

* in tags.py, do the tab-check first to avoid doing the tag-calculation twice.

* further denormalization (of key__key, of value__str) actually turns out to not
  be required for both the grouping and individual queries to be fast.

Performance tests, as always, against sqlite3.

--

Roads not taken/background

* This commit removes a future TODO that "A point _could_ be made for
  ['issue', '?value?' 'count']", I tried both versions of that index
  (against the group-then-query version, the only one which I trust)
  but without denormalization of key, I could not get it to be fast.

* I thought about a hybrid approach (for those keys with low counts of values
  do the single-query thing) but as it stands the extra complexity isn't worth
  it.

---
on the 1.2M events, 3 (user defined) tags / event test env this
basically lowers the time from "seconds" to "milliseconds".
2025-03-12 14:14:05 +01:00
Klaas van Schelven
b031792784 Event (tag) search: performance improvement
Done by denormalizing EventTag.issue, and adding that into an index. Targets:

* get-event-within-query (when it's 'last' or 'first')
* .count (of search query results)
* min/max (for the first/prev/next/last buttons)

(The min/max query's performance significantly improved by the addition of
the index, but was also rewritten into a simple SELECT rather than MIN/MAX).

When this code was written, I thought I had spectacularly improved performance.
I now believe this was based on an error in my measurements, but that this
still represents (mostly) an improvement, so I'll let it stand and will take
it from here in subsequent commits.
2025-03-12 14:11:43 +01:00
Klaas van Schelven
0358af9a59 Fix on 'stress test tags: support for RANDOM data'
i.e. actually send RANDOM data
2025-03-10 20:50:24 +01:00
Klaas van Schelven
14b99c3880 assertEquals -> assertEqual (Python 3.12)
<<insert remarks about fashion police>>

yes this isn't the first time
2025-03-10 15:45:12 +01:00
Klaas van Schelven
c344b6ca09 Add 1.4.0 changelog 2025-03-10 15:43:11 +01:00
Klaas van Schelven
0060e86117 Fix on event-list 'n available' display 2025-03-10 09:50:49 +01:00
Klaas van Schelven
050b3fe1d8 stress test tags: support for RANDOM data 2025-03-10 09:39:17 +01:00
Klaas van Schelven
f548eab778 Merge branch 'main' into tag-search 2025-03-10 09:09:40 +01:00
Klaas van Schelven
3ee6f29f9c tags: fix the indexes
this is the part I was able to do with careful reading (and rerunning the
tests); actual performance implications will be checked based on this
2025-03-07 20:59:21 +01:00
Klaas van Schelven
f8113916dd Fix the tests
literally: the tests were always broken; in 39bddb14b7 I never
ran the tests before committing
2025-03-07 20:46:11 +01:00
Klaas van Schelven
4b079487ca make_consistent: explain the use of 'dangling fk cleanup' 2025-03-07 16:36:10 +01:00
Klaas van Schelven
0ade3c0f86 Add a comment about DB-CASCADE 2025-03-07 16:35:36 +01:00
Klaas van Schelven
af4641a43d Tags page: case for empty 2025-03-07 14:14:05 +01:00
Klaas van Schelven
24b0c32281 make_consistent: tag-related models added 2025-03-07 13:57:22 +01:00
Klaas van Schelven
96e07c4dc3 Tags: delete EventTag when Events are evicted
and document related things
2025-03-07 13:50:10 +01:00
Klaas van Schelven
832539a197 Createsuperuser pre-start message: even more explicit 2025-03-07 10:47:18 +01:00
Klaas van Schelven
b560628c19 Createsuperuser pre-start: don't do that when _any_ users exist in the DB
Fixes #54
2025-03-07 09:52:43 +01:00
Klaas van Schelven
994e218e27 Notes on not-implemented tags 2025-03-06 15:58:54 +01:00
Klaas van Schelven
dfd15570e9 Fix on event details: when there are no tags, don't display the header 2025-03-06 15:24:11 +01:00