Skip to content

Problems with Maratona Mineira #492

@rsalesc

Description

@rsalesc

Wall time limit was too tight

What went wrong?

Languages such as Java and Kotlin have an initialization overhead because of the JVM. This means they will usually take program run time + C seconds to run, and this time usually involves a bunch of syscalls and interruptions that are prone to CPU rescheduling, which increase the wall clock time solutions in these languages take to run.

This is further amplified by the fact we use multiple runs in BOCA. Thus, if we have a problem with M runs, we pay for this overhead M times.

In rbx, we set the WTL as 4*TL. This is usually fine, but when you have a mixture of (a) autojudge under external heavy load, (b) small TL with multiple runs, things can go wrong.

In Mineira, we had both because the autojudge runs in the same machine as the BOCA server and UI. This puts the autojudge under external load, subject to a lot of CPU rescheduling, and potentially under a different level of stress than what it was subject to one day before when the time limits were estimated.

Why this wasn't an issue in PdA and Maratona Cearense?

PdA and Cearense both run in the Maratona's servers, which (AFAIR) are configured with multiple servers.

How to solve?

Wall time limit should probably be configurable per contest, as different environments might require different limits. Unfortunately, we can't simply set this to a super high value because users could make the queue hang by (intentionally) sleeping their programs or by (unintentionally) writing hanging interactive programs.

Thus, there should be a way to choose a sensible WTL. One way of exposing this could be a WT = aT+b formula, where T is the time limit of the problem. Users could configure both a and b.

Also, ideally autojudges should be run in a separate machine.


Unrepresented languages got no specific TLs

What went wrong?

Languages that had no solutions for a problem will usually default to the non-language-specific TL in .limits/boca.yml.

Let's suppose we have a problem with solutions in C++, Python and Kotlin. If we run rbx time for those and set specific times for these ALL THESE THREE languages, languages such as C and Java would fall back to a non-reasonable time limit, which could cause unexpected problems in the contest, such as these languages having tight itme limits.

It's hard to notice this happens because when you set specific time limits for all languages, you end up with a totally different default/fallback.

Why this wasn't an issue in PdA and Maratona Cearense?

Both systems used either an unified TL or had solutions for both Java and Kotlin. As for the C/C++ issue, they kept both in the default TL bucket.

How to solve?

We should

  1. Let users bucketize languages when estimating TLs (such as C/C++ and Java/Kt). Ideally, this should live in the env.
  2. Not let users specify TLs for ALL languages as it happened in this problem.

BOCA language name mismatch

What went wrong?

In older versions of BOCA, cpp is actually called cc. Both are actually supported by rbx, so the package is built for both language variants, with very similar flags.

The problem is that the TL for cpp is taken from the actual cpp rbx language, but cc is not found in .limits/boca.yml, and then the TL is taken from the fallback value in .limits/boca.yml, which is potentially different.

Why this wasn't an issue in PdA and Maratona Cearense?

Both competitions run under the new BOCA.

How to solve?

This can be somewhat more easily solved with the introduction of #453

We should simply assume cc == cpp and take time limits from the actual underlying rbx entry.

Additionally, similar to the problem above, we should NOT let users specify TLs for all languages.


BOCA multiple runs + rounding

Although BOCA/safeexec supports specifying fractional time limits, for some historical reason, packages are usually built with integer time limits and multiple runs. For instance, if a problem TL is 0.1s, we can simply run it 10 times with the BOCA TL of 1s.

This works just fine when TLs are round, such as with 0.2s, 0.5s, 1.5s, as in these cases we can simply specify a small, reasonable number of runs without losing precision over the original time limit.

Also, it's fine to specify in the infosheet that the TL is the actual fractional value, as in practice, running a program N times with N*TL is equivalent.

Things get fuzzy when the estimated TL is not so pretty, such as 0.3s, 0.7s, 1.2s or anything like this. We can still run the program 10 times, but there are heuristics in place that consider this too much, and end up falling back to a small integer time that would allow for an error of at most X% on the original TL.

This heuristic is inherited from box for feature parity.

This caused problems because rbx estimates TLs based on a (customizable) formula that defaults to allowing for TLs with 0.1s resolution (i.e. multiples of 0.1s). TLs such as 1.2, 3.9s would then get approximated while allowing for errors, which would in practice change the TL.

Why this wasn't an issue in PdA and Maratona Cearense?

I actually have no idea whether this happened in Cearense. In PdA, this didn't happen for one very specific reason: we changed the time limit formula, and thus the time resolution, to 0.5s. Thus, there was no TL that was not a multiple of 0.5s. This is a very happy case, where almost all problems can evaluated with a precise TL with either 1 or 2 runs.

How to solve?

Get rid of the time approximation entirely and introduce a new timing mechanism. Allow for fractional time limits in all packages. If the setter wants, they can establish a minimum running time for their problems.

For instance, one can set the minimum running time to 1s, and problems with 0.3s will be run 4 times with a 1.2s TL. This is fine and equivalent, but we should never approximate/round/allow for TL errors. This is a box feature that should NOT be ported to rbx.


Additional issues

  • We should have a preset for Maratonas, that have specific defaults tuned to BOCA. The default preset is not suited for this.
  • We should NOT be able to build a BOCA time sheet if problems don't have a .limits/boca.yml, this is non-intuitive.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bocastatementstimingumbrellaUmbrella for issues related to a specific event or use case

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions