When searching by tag, there is no need to join with Event; especially when
just counting results or determining first/last digest_order (for navigation).
(For the above "no need" to actually hold, digest_order was denormalized
into EventTag.)
The above is implemented in `search_events_optimized`.
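The idea can be sketched at the SQL level with an in-memory sqlite3 database (table and column names are simplified here, not Bugsink's actual schema): because digest_order lives on the tag row itself, counting and first/last can be answered from a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, digest_order INTEGER);
    -- digest_order is denormalized onto the tag table, so tag searches
    -- never need to touch event:
    CREATE TABLE event_tag (event_id INTEGER, tag TEXT, digest_order INTEGER);
    CREATE INDEX event_tag_idx ON event_tag (tag, digest_order);
""")
for i in range(100):
    conn.execute("INSERT INTO event VALUES (?, ?)", (i, i))
    conn.execute("INSERT INTO event_tag VALUES (?, ?, ?)",
                 (i, "env:prod" if i % 2 == 0 else "env:dev", i))

# count and first digest_order, straight from event_tag (no join):
(count,) = conn.execute(
    "SELECT COUNT(*) FROM event_tag WHERE tag = ?", ("env:prod",)).fetchone()
(first,) = conn.execute(
    "SELECT digest_order FROM event_tag WHERE tag = ? "
    "ORDER BY digest_order LIMIT 1", ("env:prod",)).fetchone()
print(count, first)  # 50 0
```

Both queries are fully served by the (tag, digest_order) index.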
Further improvements:
* the bounds of `digest_order` are fetched only once; for first/last this info
is reused.
* explicitly pass `event_qs_count` to the templates
* non-event pages used to calculate a "last event" to generate a tab with a
correct event.id; since we now simply have the "last" idiom, we use that
instead. This also makes clear that the "none" idiom was never needed, so we
remove it again.
Results:
Locally (60K event DB, 30K events on largest issue) my testbatch now
runs in 25% of time (overall).
* The effect on AND-ed tag searches is in fact very large (13% of runtime
remaining).
* The event details page is not noticeably improved.
* denormalize IssueTag.key; this allows key to be used in an index
(issue, key, count).
* rewrite to grouping-first, per-key-query-second, i.e. revert part of
bbfee84c6a. Reasoning: I don't want to rely on "mostly unique" always
guessing correctly, and we don't dynamically determine that yet. This
means that (in the single-query version), if some tag had a per-event
value, you could end up iterating over as many values as there are events,
which won't work.
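A minimal sketch of "grouping-first, per-key-query-second" with sqlite3 (schema simplified and hypothetical): even when a tag has a per-event value, step 2 is bounded by its LIMIT, so we never iterate over all values of such a tag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issue_tag (
        issue_id INTEGER, key TEXT, value TEXT, count INTEGER);
    -- key is denormalized so it can appear in the index:
    CREATE INDEX issue_tag_idx ON issue_tag (issue_id, key, count);
""")
rows = [(1, "browser", "chrome", 30), (1, "browser", "firefox", 10)]
# a "per-event" tag: as many distinct values as there are events
rows += [(1, "trace_id", f"trace-{i}", 1) for i in range(1000)]
conn.executemany("INSERT INTO issue_tag VALUES (?, ?, ?, ?)", rows)

# step 1: grouping-first -- one row per key, never one per value
keys = [k for (k,) in conn.execute(
    "SELECT DISTINCT key FROM issue_tag WHERE issue_id = ?", (1,))]

# step 2: per-key query, bounded by LIMIT regardless of value cardinality
top = {key: conn.execute(
    "SELECT value FROM issue_tag WHERE issue_id = ? AND key = ? "
    "ORDER BY count DESC LIMIT 3", (1, key)).fetchall()
    for key in keys}
print(top["browser"][0])  # ('chrome',)
```

The per-key queries cost one roundtrip each, but the number of keys is small and bounded in practice, unlike the number of values.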
* in tags.py, do the tab-check first to avoid doing the tag-calculation twice.
* further denormalization (of key__key, of value__str) actually turns out not
to be required for both the grouping and individual queries to be fast.
Performance tests, as always, against sqlite3.
--
Roads not taken/background
* This commit removes a future TODO that "A point _could_ be made for
['issue', '?value?' 'count']", I tried both versions of that index
(against the group-then-query version, the only one which I trust)
but without denormalization of key, I could not get it to be fast.
* I thought about a hybrid approach (for those keys with low counts of values
do the single-query thing) but as it stands the extra complexity isn't worth
it.
---
on the 1.2M events, 3 (user defined) tags / event test env this
basically lowers the time from "seconds" to "milliseconds".
Done by denormalizing EventTag.issue, and adding that into an index. Targets:
* get-event-within-query (when it's 'last' or 'first')
* .count (of search query results)
* min/max (for the first/prev/next/last buttons)
(The min/max queries' performance was significantly improved by the addition
of the index, but they were also rewritten into simple SELECTs rather than
MIN/MAX.)
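The rewrite can be illustrated with sqlite3 (schema simplified; column names are illustrative): with an index on (issue_id, digest_order), an ORDER BY ... LIMIT 1 just reads one end of the index range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (
        id INTEGER PRIMARY KEY, issue_id INTEGER, digest_order INTEGER);
    CREATE INDEX event_issue_idx ON event (issue_id, digest_order);
""")
conn.executemany("INSERT INTO event VALUES (?, 1, ?)",
                 [(i, i) for i in range(1000)])

# instead of SELECT MIN(digest_order), MAX(digest_order):
(first,) = conn.execute(
    "SELECT digest_order FROM event WHERE issue_id = ? "
    "ORDER BY digest_order ASC LIMIT 1", (1,)).fetchone()
(last,) = conn.execute(
    "SELECT digest_order FROM event WHERE issue_id = ? "
    "ORDER BY digest_order DESC LIMIT 1", (1,)).fetchone()
print(first, last)  # 0 999
```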
When this code was written, I thought I had spectacularly improved performance.
I now believe this was based on an error in my measurements, but that this
still represents (mostly) an improvement, so I'll let it stand and will take
it from here in subsequent commits.
Prompted by a user being confused about the number of events in their DB;
not 100% sure I'll keep this info here, but I'm introducing it for now at
least.
In b76e474ef1, the event-navigation was changed into the next/prev idiom (I
think completely, i.e. also in the .html files, but I did not check), but the
elif structure and error message did not fully reflect that (they still talked
about digest_order/id, even though nav is now one of the primary methods).
I briefly considered removing the lookup-by-digest-order-only, but I figure it
may come in handy at some point (if only for users to directly edit the URL);
in any case, I did not check whether it is actually unused.
Avoiding any (1406, "Data too long for column ...") on MySQL.
For the 'plainly provided' fields I followed the documented maximums which are
also our DB maximums. For calculated_* I harmonized with what Sentry &
GlitchTip both do (and which was already partially reflected in the code), i.e.
128 and 1024.
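A hedged sketch of the truncation idea (the helper and field names below are hypothetical, not Bugsink's actual code; the 128/1024 limits are the ones mentioned above):

```python
# Illustrative only: clip values to the column's max_length before saving,
# so MySQL never raises (1406, "Data too long for column ...").
MAX_LENGTHS = {
    "calculated_type": 128,    # harmonized with Sentry & GlitchTip
    "calculated_value": 1024,
}

def truncate_for_db(field: str, value: str) -> str:
    return value[:MAX_LENGTHS[field]]

print(len(truncate_for_db("calculated_type", "x" * 500)))  # 128
```

In practice the same effect can also be enforced at the model layer (max_length plus explicit truncation before save).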
"possibly expensive" turned out to be "actually expensive". On 'emu', with 1.5M
events, the counts take 85 and 154 ms for Project and Issue respectively;
bottlenecking our digestion to ~3 events/s.
Note: this is single-issue, single-project (presumably, the cost would be lower
for more spread-out cases)
Note on indexes: Event already has indexes for both Project & Issue (though
only as the first item of a multi-column index). Without checking further:
that appears not to "magically solve counting".
This commit also optimizes the .count() on the issue-detail event list (via
Paginator).
This commit also slightly changes the value passed as `stored_event_count`
(as used by `get_random_irrelevance`) to be the post-eviction value. That
won't matter much in practice, but it is slightly more correct IMHO.
Triggered by issue_event_list being more than 5s on "emu" (my 1,500,000 event
test-machine). Reason: sorting those events on non-indexed field. Switching
to a field-with-index solved it.
I then analysed (grepped) for "ordering" and "order_by" and set indexes
accordingly and more or less indiscriminately (i.e. even on tables that are
assumed to have relatively few rows, such as Project & Team).
## Goal
Reduce the number of migrations for _fresh installs_ of Bugsink. This implies: squash as
broadly as possible.
## How?
"throw-away-and-rerun". In particular, for a given app:
* throw away the migrations from some starting point up until and including the last one.
* run "makemigrations" for that app. Django will see what's missing and just redo it
* rename to 000n_b_squashed or similar.
* manually set a `replaces` list on the migration to the just-removed migrations
* manually check dependencies; check that they are:
* as low as possible, e.g. an FK should only depend on the model's
existence. This reduces the risk of circular dependencies.
* pointing to "original migrations", i.e. not to a just-created squashed migration.
because the squashed migrations "contain a lot" they increase the risk of circular
dependencies.
* restore (git checkout) the thrown-away migration
## Further tips:
* "Some starting point" is often not 0000, but some higher number (see e.g. the outcome
in the present commit). Leaving the migrations for creation of base models (Event,
Issue, Project) in place saves you from a lot of circular dependency problems.
* Move db.sqlite3 out of the way to avoid superfluous warnings.
## RunPython worries
I grepped for RunPython in the replaced migrations, with the following results:
* phonehome's create_installation_id was copied-over to the squashed migration.
* all others were ignored, because:
* they "do something with events", i.e. only when events are present will they have
an effect. This means they are no-ops for _new installs_.
* for existing installs, for any given app, they will only be missed (replaced) when
the first replaced migration is not yet executed.
I used the following command (reading from the bottom) to establish that this means only
people that did a fresh install after 8ad6059722 (June 14, 2024), but before
c01d332e18 (July 16) _and then never did any upgrades_ would be affected. There are no
such people.
    git log --name-only \
        events/migrations/0004_event_irrelevance_for_retention.py \
        issues/migrations/0004_rename_event_count_issue_digested_event_count.py \
        phonehome/migrations/0001_initial.py \
        projects/migrations/0002_initial.py \
        teams/migrations/0001_initial.py
Note that the above observation will still hold for the next squashmigration
(assuming squashing starts at the same starting migrations).
## Cleanup of the replaced migrations
Django says:
> Once you’ve squashed your migration, you should then commit it alongside the
> migrations it replaces and distribute this change to all running instances of your
> application, making sure that they run migrate to store the change in their database.
Given that I'm not in control of all running instances of my application, this means the
cleanup must not happen "too soon", and only after announcing a migration path ("update
to version X before updating to version Y").
## Roads not taken
Q: Why not just use squashmigrations? A: It didn't work reliably (for me),
presumably because of the high number of strongly interdependent apps in
combination with some RunPython.
Seen after I was mostly done, not explored seriously (yet):
* https://github.com/3YOURMIND/django-replace-migrations
* https://pypi.org/project/django-squash/
* https://django-extensions.readthedocs.io/en/latest/delete_squashed_migrations.html