In the light of the discussion on #134, this implements the "clean up later"
solution: a vacuum task that deletes IssueTags no longer referenced by any
EventTag on the same Issue.
This doesn't prevent stale IssueTags from being created but ensures they are
eventually removed, enabling follow-up cleanup (e.g. of TagValues).
Performance-wise, this is a relatively safe path forward; it can run off-hours
or not at all, depending on preferences. Semantically it's the least clear:
whether an Issue appears to be tagged may now depend on whether vacuum has run.
No tests yet; no immediate TagValue cleanup.
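The vacuum's core delete can be sketched against an in-memory sqlite3 database; the table and column names below are assumptions modeled on the queries quoted in later commits, not the actual implementation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags_eventtag (id INTEGER PRIMARY KEY, issue_id, value_id);
CREATE TABLE tags_issuetag (id INTEGER PRIMARY KEY, issue_id, value_id);
""")
# Issue 1 still has an EventTag for value 10; its IssueTag for value 20 is stale.
conn.execute("INSERT INTO tags_eventtag (issue_id, value_id) VALUES (1, 10)")
conn.executemany("INSERT INTO tags_issuetag (issue_id, value_id) VALUES (?, ?)",
                 [(1, 10), (1, 20)])

# The vacuum: delete IssueTags with no EventTag for the same (issue, value).
conn.execute("""
DELETE FROM tags_issuetag WHERE NOT EXISTS (
    SELECT 1 FROM tags_eventtag
    WHERE tags_eventtag.issue_id = tags_issuetag.issue_id
      AND tags_eventtag.value_id = tags_issuetag.value_id)
""")
remaining = conn.execute(
    "SELECT issue_id, value_id FROM tags_issuetag").fetchall()
print(remaining)  # only the still-referenced (1, 10) survives
```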
Like e45c61d6f0, but for .project.
I originally thought `SET_NULL` would be a good way to "do stuff later", but
that's only true to the degree that [1] updates are cheaper than deletes and [2]
2nd-order effects (further deletes in the dep-tree) are avoided.
Now that we have explicit Project-deletion (deps-first, delayed, properly batched)
the SET_NULL behavior is always a no-op (but with cost in queries).
As a result, in the test for project deletion (which has deletes for many
of the altered models), the following 12 queries are no longer done:
```
SELECT "projects_project"."id", [..many fields..] FROM "projects_project" WHERE "projects_project"."id" = 1
DELETE FROM "projects_projectmembership" WHERE "projects_projectmembership"."project_id" IN (1)
DELETE FROM "alerts_messagingserviceconfig" WHERE "alerts_messagingserviceconfig"."project_id" IN (1)
UPDATE "releases_release" SET "project_id" = NULL WHERE "releases_release"."project_id" IN (1)
UPDATE "issues_issue" SET "project_id" = NULL WHERE "issues_issue"."project_id" IN (1)
UPDATE "issues_grouping" SET "project_id" = NULL WHERE "issues_grouping"."project_id" IN (1)
UPDATE "events_event" SET "project_id" = NULL WHERE "events_event"."project_id" IN (1)
UPDATE "tags_tagkey" SET "project_id" = NULL WHERE "tags_tagkey"."project_id" IN (1)
UPDATE "tags_tagvalue" SET "project_id" = NULL WHERE "tags_tagvalue"."project_id" IN (1)
UPDATE "tags_eventtag" SET "project_id" = NULL WHERE "tags_eventtag"."project_id" IN (1)
UPDATE "tags_issuetag" SET "project_id" = NULL WHERE "tags_issuetag"."project_id" IN (1)
```
I originally thought `SET_NULL` would be a good way to "do stuff later", but
that's only true to the degree that [1] updates are cheaper than deletes and [2]
2nd-order effects (further deletes in the dep-tree) are avoided.
Now that we have explicit Issue-deletion (deps-first, delayed, properly batched)
the SET_NULL behavior is always a no-op (but with cost in queries).
As a result, in the test for issue deletion (which has deletes for many
of the altered models), the following 8 queries are no longer done:
```
SELECT "issues_grouping"."id", [..many fields..] FROM "issues_grouping" WHERE "issues_grouping"."id" IN (1)
UPDATE "events_event" SET "grouping_id" = NULL WHERE "events_event"."grouping_id" IN (1)
[.. a few moments later..]
SELECT "issues_issue"."id", [..many fields..] FROM "issues_issue" WHERE "issues_issue"."id" = 'uuid'
UPDATE "issues_grouping" SET "issue_id" = NULL WHERE "issues_grouping"."issue_id" IN ('uuid')
UPDATE "issues_turningpoint" SET "issue_id" = NULL WHERE "issues_turningpoint"."issue_id" IN ('uuid')
UPDATE "events_event" SET "issue_id" = NULL WHERE "events_event"."issue_id" IN ('uuid')
UPDATE "tags_eventtag" SET "issue_id" = NULL WHERE "tags_eventtag"."issue_id" IN ('uuid')
UPDATE "tags_issuetag" SET "issue_id" = NULL WHERE "tags_issuetag"."issue_id" IN ('uuid')
```
(This breaks the tests, because of constraints and because factories are not always used; will fix next.)
CASCADE was defined for keys & values, but in practice those are never directly
deleted except in the very case in which it has been established that they are
'orphaned', i.e. no longer referred to. That is exactly the case in which
CASCADE is superfluous.
As a result, in the test for issue deletion (which contains a prune of
tagvalue), the following 3 queries are no longer done:
```
SELECT "tags_tagvalue"."id", "tags_tagvalue"."project_id", "tags_tagvalue"."key_id", "tags_tagvalue"."value" FROM "tags_tagvalue" WHERE "tags_tagvalue"."id" IN (1)
DELETE FROM "tags_eventtag" WHERE "tags_eventtag"."value_id" IN (1)
DELETE FROM "tags_issuetag" WHERE "tags_issuetag"."value_id" IN (1)
```
Implemented using a batch-wise dependency-scanner in delayed
(snappea) style.
* no tests yet.
* no real point-of-entry in the (regular, non-admin) UI yet.
* no hiding of Issues which are delete-in-progress from the UI
* file storage not yet cleaned up
* project issue counts not yet updated
* dangling tag values: no cleanup mechanism yet.
See #50
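The batch-wise, deps-first shape of the deletion can be sketched as below; the table list, batch size and the snappea-style scheduling are simplified away, so this is an illustration under assumed names rather than the actual implementation:

```python
import sqlite3

def delete_issue_deps_first(conn, issue_id, batch_size=2):
    # Dependents are deleted before the Issue itself, in bounded batches,
    # so no single DELETE statement grows with the size of the issue.
    for table in ["tags_eventtag", "tags_issuetag", "events_event"]:
        while True:
            cur = conn.execute(
                f"DELETE FROM {table} WHERE id IN "
                f"(SELECT id FROM {table} WHERE issue_id = ? LIMIT ?)",
                (issue_id, batch_size))
            if cur.rowcount < batch_size:  # short batch: this table is done
                break
    conn.execute("DELETE FROM issues_issue WHERE id = ?", (issue_id,))

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issues_issue (id INTEGER PRIMARY KEY);
CREATE TABLE tags_eventtag (id INTEGER PRIMARY KEY, issue_id);
CREATE TABLE tags_issuetag (id INTEGER PRIMARY KEY, issue_id);
CREATE TABLE events_event (id INTEGER PRIMARY KEY, issue_id);
""")
conn.execute("INSERT INTO issues_issue (id) VALUES (1)")
conn.executemany("INSERT INTO events_event (issue_id) VALUES (?)", [(1,)] * 5)
conn.executemany("INSERT INTO tags_eventtag (issue_id) VALUES (?)", [(1,)] * 3)

delete_issue_deps_first(conn, 1)
left = conn.execute("SELECT COUNT(*) FROM events_event").fetchone()[0]
print(left)  # 0
```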
Problem: on mysql `make_consistent` cannot always clean up `Event`s, because
`EventTag` objects still point to them, leading to an IntegrityError.
The problem does not occur on `sqlite`, because sqlite does its FK-checks
on-commit, and the offending `EventTag` objects are "eventually cleaned up" (in
the same transaction, in `make_consistent`).
This is the "mostly works" solution, for the scenario we've encountered.
Namely: remove `EventTag`s which have no issue before removing `Event`s. This
works in practice because of how Events-to-clean-up were actually created in
the UI: by removal of some Issue in the admin, which triggers a `SET_NULL` on
the `issue_id`. Removal of an Issue implies an analogous `SET_NULL` on the
`EventTag`'s `issue_id`, and by removing those `EventTag`s before proceeding
with the `Event`s, we avoid triggering the FK constraint.
We don't want to fully reimplement `CASCADE` (as in Django) here, the values of
`on_delete` are "Design Decision Needed" and non-homogeneous anyway, and we
might soon implement proper deletions (see #50) anyway, so the "mostly works"
solution will have to do for now.
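The ordering can be demonstrated against sqlite3 with FK enforcement switched on (which, like mysql, then checks per statement); the minimal schema below is an assumption, not the project's real one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce FKs per statement
conn.executescript("""
CREATE TABLE events_event (id INTEGER PRIMARY KEY, issue_id);
CREATE TABLE tags_eventtag (
    id INTEGER PRIMARY KEY, issue_id,
    event_id REFERENCES events_event (id));
""")
# An admin-deleted Issue left issue_id NULL on both tables (SET_NULL).
conn.execute("INSERT INTO events_event (id, issue_id) VALUES (1, NULL)")
conn.execute("INSERT INTO tags_eventtag (issue_id, event_id) VALUES (NULL, 1)")

# Deleting the Event first would hit the FK constraint (the EventTag still
# points to it); removing the issue-less EventTags first avoids that.
conn.execute("DELETE FROM tags_eventtag WHERE issue_id IS NULL")
conn.execute("DELETE FROM events_event WHERE issue_id IS NULL")
print("ok")
```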
Fixes #132
In b031792784, using Event.issue was made conditional (if we already filter
by Tag, the tag encodes that info already, and it was assumed adding the
WHERE elsewhere would confuse the query optimizer).
As per that commit's message, the measurements that led me to that decision
were probably wrong. I now simply think: the more places you narrow your
search, the easier your DB will have it.
Measuring confirms this is indeed so (on the order of 20-30%) for all cases
in which this still matters (the present fix is on the now-less-visited path).
When searching by tag, there is no need to join with Event; especially when
just counting results or determining first/last digest_order (for navigation).
(For the above "no need" to actually hold, digest_order was denormalized
into EventTag.)
The above is implemented in `search_events_optimized`.
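The join-free idea can be sketched against sqlite3; with `digest_order` denormalized onto `EventTag`, both the count and the navigation bounds are answered from that one table (all names here are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags_eventtag (issue_id, value_id, digest_order)")
conn.executemany(
    "INSERT INTO tags_eventtag VALUES (1, 10, ?)", [(3,), (7,), (5,)])

# Counting search results: no join with events_event is needed.
count = conn.execute(
    "SELECT COUNT(*) FROM tags_eventtag "
    "WHERE issue_id = 1 AND value_id = 10").fetchone()[0]

# "last" digest_order for navigation, likewise from EventTag alone.
last = conn.execute(
    "SELECT digest_order FROM tags_eventtag "
    "WHERE issue_id = 1 AND value_id = 10 "
    "ORDER BY digest_order DESC LIMIT 1").fetchone()[0]
print(count, last)  # 3 7
```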
Further improvements:
* the bounds of `digest_order` are fetched only once; for first/last this info
is reused.
* explicitly pass `event_qs_count` to the templates
* non-event pages used to calculate a "last event" to generate a tab with a
  correct event.id; since we now simply have the "last" idiom, we use that
  instead. This also makes clear the "none" idiom was never needed, so we
  remove it again.
Results:
Locally (60K event DB, 30K events on largest issue) my testbatch now
runs in 25% of time (overall).
* The effect on the AND-ing is in fact very large (13% runtime remaining)
* The event details page is not noticeably improved.
* denormalize IssueTag.key; this allows key to be used in an index
(issue, key, count).
* rewrite to grouping-first, per-key-query-second, i.e. revert part of
bbfee84c6a. Reasoning: I don't want to rely on "mostly unique" always
guessing correctly, and we don't dynamically determine that yet. This
means that (in the single-query version), if you had a per-event value for
some tag, you could end up iterating over as many values as there are events,
which won't work.
* in tags.py, do the tab-check first to avoid doing the tag-calculation twice.
* further denormalization (of key__key, of value__str) actually turns out not
to be required for either the grouping or the individual queries to be fast.
Performance tests, as always, against sqlite3.
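The grouping-first, per-key-second shape, sketched against sqlite3 (the denormalized `key_id` and a `count` column on `IssueTag`, and all names, are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags_issuetag (issue_id, key_id, value_id, count)")
conn.executemany("INSERT INTO tags_issuetag VALUES (?, ?, ?, ?)", [
    (1, 100, 10, 5), (1, 100, 11, 2), (1, 200, 20, 7)])

# Query 1 (the "grouping"): which keys does this issue have at all?
keys = [k for (k,) in conn.execute(
    "SELECT DISTINCT key_id FROM tags_issuetag WHERE issue_id = 1")]

# Query 2..n (per key): top values, servable by an (issue, key, count)
# index; this never iterates over more values than a single key has.
top = {key: conn.execute(
    "SELECT value_id FROM tags_issuetag "
    "WHERE issue_id = 1 AND key_id = ? ORDER BY count DESC LIMIT 1",
    (key,)).fetchone()[0] for key in keys}
print(top)  # {100: 10, 200: 20}
```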
--
Roads not taken/background
* This commit removes a future TODO that "A point _could_ be made for
['issue', '?value?' 'count']", I tried both versions of that index
(against the group-then-query version, the only one which I trust)
but without denormalization of key, I could not get it to be fast.
* I thought about a hybrid approach (for those keys with low counts of values
do the single-query thing) but as it stands the extra complexity isn't worth
it.
---
On the 1.2M-events, 3 (user-defined) tags-per-event test env, this
basically lowers the time from "seconds" to "milliseconds".
Done by denormalizing EventTag.issue, and adding that into an index. Targets:
* get-event-within-query (when it's 'last' or 'first')
* .count (of search query results)
* min/max (for the first/prev/next/last buttons)
(The min/max query's performance significantly improved by the addition of
the index, but was also rewritten into a simple SELECT rather than MIN/MAX).
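The two forms of that query, side by side in a sqlite3 sketch (names assumed); both return the same bound, the commit's point being that the simple SELECT form walks the index directly in the query as actually composed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags_eventtag (issue_id, digest_order)")
conn.execute(
    "CREATE INDEX et_issue_do ON tags_eventtag (issue_id, digest_order)")
conn.executemany(
    "INSERT INTO tags_eventtag VALUES (1, ?)", [(4,), (9,), (2,)])

# The aggregate form...
via_min = conn.execute(
    "SELECT MIN(digest_order) FROM tags_eventtag "
    "WHERE issue_id = 1").fetchone()[0]
# ...and the simple-SELECT form it was rewritten into: read the first
# entry of the (issue_id, digest_order) index and stop.
via_select = conn.execute(
    "SELECT digest_order FROM tags_eventtag WHERE issue_id = 1 "
    "ORDER BY digest_order LIMIT 1").fetchone()[0]
print(via_min, via_select)  # 2 2
```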
When this code was written, I thought I had spectacularly improved performance.
I now believe this was based on an error in my measurements, but that this
still represents (mostly) an improvement, so I'll let it stand and will take
it from here in subsequent commits.
Maybe the passed-in event also avoids this, but the present method will
always do what you expect, and that is obvious upon reading, rather than
requiring the reader to think it through.