Skip to content

bug: Thread-safety issues between packet handlers and entity state #385

@andreionoie

Description

@andreionoie

What happened?

Context

Current architecture allows concurrent access to mutable Entity state from the main tick loop (GameServer.Tick(), 100Hz) and multiple async network threads (PacketHandlers).

Basic example: separate threads calling Entity.Attack()

This runs on the main "world update" thread within a tick:

This runs on a separate packet handler thread:

Entity.Attack() decrements Entity.Health directly:

Similar issue for: PositionX, PositionY, PlayerData, Rotation, State, Target, and Collections (NearbyEntities, TargetedBy)...

Some failure scenarios

  1. race conditions:
    1.1. Main thread vs Network thread: concurrent health regen + damage results in lost updates.
    1.2. Network thread A vs Network thread B: two players attacking the same target concurrently can corrupt state.
  2. non-transactional updates ("check-then-act"):
    2.1. Packet handler verifies !Target.Dead.
    2.2. Main thread executes Target.Die(): sets Dead = true, removes from Map.
    2.3. Packet handler executes Damage() on a now dead/removed entity.
  3. server crashes:
    3.1. Entity.NearbyEntities is List (not thread-safe).
    3.2. Handler iterates list to send packets vs Main Thread adding entities => System.InvalidOperationException (Collection modified during enumeration)..

Possible fixes

  1. changing property access
    1.1. volatile (i.e. System.Threading.Volatile) - doesn't prevent actual race conditions
    1.2. mutex / monitor locks (i.e. System.Threading.Lock) - probably too slow, need to remember to use locks everywhere, deadlock risk
    1.3 atomics (i.e. System.Threading.Interlocked) - would need to refactor properties to be Interlocked? awkward usage in code
  2. main thread "Action Queue" that invokes lambdas/closures enqueued from the packet handlers (i.e. something like the existing EventSystem)
    • GC pressure if enqueuing closures causes heap allocs: maybe can pass structs instead of lambdas to the queue?
    • packet handlers will basically only validate packet data and not execute actions anymore

How to reproduce?

No response

Stacktrace

No response

OS Platform

Other (please specify in the additional information)

Additional Information and Notes

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions