Joker_vD 2 days ago

> Delete in batches (as opposed to one-by-one), and without moving the event objects out of the database server – try to avoid loading them in the cleaning process before removing them, and don’t bother sending removed items back over the wire.

Nah, I am quite sure

    for item in conn.query('SELECT * FROM event'):
        if item.created_at < created_at_horizon:
            conn.exec('DELETE FROM event WHERE ' + ' AND '.join(f'{k} = :{k}' for k in item.__dict__.keys), item.__dict__)
is fine, so I won't even bother measuring it. It's also secure, don't worry: as you can see I used prepared statements in it.
kelseydh 2 days ago

I've run into this problem as well but I'm reticient to embrace deleting the events table because that data is the very heart and soul of what we are doing with information technology: tracking facts and values.

(Facts are immutable, values change with time: https://www.youtube.com/watch?v=D6nYfttnVco )

I would much rather a solution where I can offload old events into some other storage system, so that it doesn't clog my database but the information is still retrievable.

  • edmundsauto 2 days ago

    Events tables can also be privacy nightmares, depending on how stringent your industry's requirements are. Not clear ownership, really hard to deprecate events, one of the worst kinds of tech debt in my experience.

    Don't collect data if you don't have a clear purpose for it, IMO.

  • bob1029 2 days ago

    > I would much rather a solution where I can offload old events into some other storage system

    At large enough scale, it becomes mandatory to separate the event bus from the system of record. I witnessed this in semiconductor manufacturing. There was a gigantic transaction table of finalized things, and all of the detail event data was on a separate messaging bus/system with its own log files. The raw event data was on the order of 300-400 megabytes per hour per mfg line.

    Even with this separation, the high level table was still very scary to use. An unconstrained "SELECT *" would invariably result in a visit from the on-call DBA.

    • AndyNemmity 2 days ago

      I don't even thing the scale has to be that large. You can't hold the event bus, and the system of record.

      I was going to post about this very topic, but you accurately pointed it out.

orf a day ago

Why have a “created_at” column if you’re using ULIDs? Isn’t that encoded as part of the ULID itself?

danjc a day ago

Recommend creating a table per utc day for logs. Maintenance is a matter of dropping old tables.