
Performance Drop Issues (AIDA test A) #3

Description

@liehe

According to the paper, the performance of PBoH on AIDA test A is 86.63/85.48. Following the Gerbil upgrade, the performance of PBoH given here is 75.19/73.3.

However, when I try to reproduce the result, I get 64.84/64.32 (micro/macro F1 for Loopy):

############### RESULTS for dataset AIDA test A for
TEST w = loopybeliefpropagation.ScorerWeights@3cb5cdba params a = 0.5, f = 1.0, g = 0.5, h = 1.0, s = 0.0, b = 0.075 #################
Num total docs = 216
Num total mentions (including duplicates) = 4781

Looking at docs with GLOBAL mentions:
GLOBAL mentions : num docs evaluated = 216; num mentions in solution = 4065.0 num mentions in ground truth = 4781.0

#################################
GLOBAL mentions : micro F1 (per mention) Loopy : 64.84286683246664
GLOBAL mentions : micro accuracy/recall (per mention) Loopy : 59.987450324199955
GLOBAL mentions : MACRO F1 (per doc) Loopy : 64.3229987260122
GLOBAL mentions : MACRO accuracy/recall (per doc) Loopy : 58.679661769593
###################################

GLOBAL mentions : micro F1 (per mention) ARGMAX : 62.85326701333936
GLOBAL mentions : micro acc/recall (per mention) ARGMAX : 58.1468312068605
GLOBAL mentions : MACRO F1 (per doc) ARGMAX : 61.799181989071386
GLOBAL mentions : MACRO acc/recall (per doc) ARGMAX : 56.37695767780255

GLOBAL mentions : MACRO (per doc) common Loopy - ARGMAX : 92.41201422346343
GLOBAL mentions : micro (per mention) common Loopy - ARGMAX : 94.98154981549816
GLOBAL mentions : micro (per mention) perc missing mentions from index : 14.975946454716587
GLOBAL mentions : micro (per mention) perc missing entities from mention index : 17.025726835390085


GLOBAL mentions : avg LBP running time (milliseconds) : 71.80092592592592
GLOBAL mentions : avg num iters in LBP : 2.6805555555555554
GLOBAL mentions : percentage cases where LBP converged: 100.0
GLOBAL mentions : avg num candidates per mention: 5.67029883619935

==============================================
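As a sanity check on the numbers above, here is a minimal sketch that recomputes the micro scores from the counts in the log. It assumes (this is not confirmed by the PBoH code) that micro recall is correct / mentions in ground truth and micro precision is correct / mentions in solution. Under that assumption the reported F1 is reproduced, and the gap between 4065 answered and 4781 ground-truth mentions matches the reported "perc missing mentions from index".

```python
# Values copied from the log above.
mentions_in_truth = 4781
mentions_in_solution = 4065
micro_recall_pct = 59.987450324199955

# Assumption: recall = correct / ground truth, so the number of correctly
# linked mentions can be recovered from the reported recall.
correct = round(micro_recall_pct / 100 * mentions_in_truth)

# Assumption: precision = correct / mentions for which an answer was produced.
precision_pct = 100 * correct / mentions_in_solution
recall_pct = 100 * correct / mentions_in_truth
f1_pct = 2 * precision_pct * recall_pct / (precision_pct + recall_pct)

# Mentions that received no answer at all.
unanswered = mentions_in_truth - mentions_in_solution
unanswered_pct = 100 * unanswered / mentions_in_truth

# Upper bound on recall if every answered mention were linked correctly.
recall_ceiling_pct = 100 * mentions_in_solution / mentions_in_truth

print(f"correct mentions : {correct}")
print(f"micro precision  : {precision_pct:.2f}")
print(f"micro recall     : {recall_pct:.2f}")
print(f"micro F1         : {f1_pct:.2f}")
print(f"unanswered       : {unanswered} ({unanswered_pct:.2f}%)")
print(f"recall ceiling   : {recall_ceiling_pct:.2f}")
```

If these definitions hold, about 15% of the mentions are never answered, which caps recall at roughly 85% before any linking errors are counted. So a large part of the drop may come from the mention index rather than from the LBP inference itself.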

  1. I used the index files from polybox and updated the index file locations accordingly.

  2. I changed from

    val file = "/media/hofmann-scratch/Octavian/entity_linking/marinah/AIDA/testa_testb_aggregate"

to "AIDA-YAGO2-dataset.tsv", which is generated from the files downloaded from MPI-INF.

  3. I use

java -Xmx90g -cp target/PBoH-1.0-SNAPSHOT-jar-with-dependencies.jar el.EL_LBP_Spark testPBOHOnAllDatasets max-product

to run the code because the command

scala -J-Xmx90g target/PBoH-1.0-SNAPSHOT-jar-with-dependencies.jar testPBOHOnAllDatasets max-product

throws an UnsatisfiedLinkError when it tries to use leveldbjni.
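Since step 2 swaps the input file, one way to rule out a dataset mismatch is to count the test A documents and mentions in the generated file and compare them with `Num total docs = 216` and `Num total mentions (including duplicates) = 4781` from the log. The sketch below is only an illustration: it assumes the usual AIDA-YAGO2 TSV layout (a `-DOCSTART- (...)` line per document whose label contains `testa`, then tab-separated token lines with a `B`/`I` tag in the second column and the entity or `--NME--` in the fourth). The function name `count_split` and the sample rows are made up for the example.

```python
from collections import Counter


def count_split(lines, split="testa"):
    """Count documents and mentions of one split in an AIDA-style TSV.

    Assumed layout (not taken from the PBoH code): documents start with a
    '-DOCSTART- (...)' line, and each mention starts on a token line whose
    second column is 'B' and whose fourth column is the entity or '--NME--'.
    """
    docs = 0
    mentions = Counter()
    in_split = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("-DOCSTART-"):
            # Simple substring match on the document label.
            in_split = split in line
            if in_split:
                docs += 1
            continue
        if not in_split or not line:
            continue
        cols = line.split("\t")
        if len(cols) >= 4 and cols[1] == "B":
            kind = "nme" if cols[3] == "--NME--" else "linked"
            mentions[kind] += 1
    return docs, mentions


# Made-up rows in the assumed layout, only to show the call.
sample = [
    "-DOCSTART- (1 TRAIN_DOC)",
    "Foo\tB\tFoo\tFoo_Entity",
    "-DOCSTART- (2testa SOME_DOC)",
    "Foo\tB\tFoo Bar\tFoo_Bar_Entity",
    "Bar\tI\tFoo Bar\tFoo_Bar_Entity",
    "plays",
    "Baz\tB\tBaz\t--NME--",
]
docs, mentions = count_split(sample)
print(docs, dict(mentions))
```

For the real file, passing `open("AIDA-YAGO2-dataset.tsv", encoding="utf-8")` as `lines` should work, since the function only iterates over lines. If the counts differ from the log, the regenerated file is probably not equivalent to the original `testa_testb_aggregate` file.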

Did I make any mistakes in the process? How can I reproduce the Gerbil result?

Thanks.
